Three conflicts — all resolved in favor of develop, which has a more
refined version of the same work this branch introduced:
- radar_system_top.v: develop's cleaner USB_MODE=1 comment (same value).
- run_regression.sh: develop's ${SYSTEM_RTL[@]} refactor + added
USB_MODE=1 test variants.
- tb/radar_system_tb.v: develop's ifdef USB_MODE_1 to dump the correct
USB instance based on mode.
The 400 MHz reset fan-out fix (nco_400m_enhanced, cic_decimator_4x_enhanced,
ddc_400m) and ADAR1000 channel-indexing fix remain intact on this branch.
The four channel-indexed ADAR1000 setters (adarSetRxPhase, adarSetTxPhase,
adarSetRxVgaGain, adarSetTxVgaGain) computed their register offset as
`(channel & 0x03) * stride`, which silently aliased CH4 (channel=4 ->
mask=0) onto CH1 and shifted CH1..CH3 by one. The API contract (1-based
CH1..CH4) is documented in ADAR1000_AGC.cpp:76 and matches the ADI
datasheet; every existing caller already passes `ch + 1`.
Fix: subtract 1 before masking -- `((channel - 1) & 0x03) * stride` --
and reject `channel < 1 || channel > 4` early with a DIAG message so a
future stale 0-based caller fails loudly instead of writing to CH4.
Adds TestTier1Adar1000ChannelRegisterRoundTrip (9 tests) which closes
the loop independently of the driver:
- parses the ADI register map directly from ADAR1000_Manager.h,
- verifies the datasheet stride invariants (gain=1, phase=2),
- auto-discovers every C++ TU under MCU_LIB_DIR / MCU_CODE_DIR so a
new caller cannot silently escape the round-trip check,
- asserts every caller's channel argument evaluates to {1,2,3,4} for
ch in {0,1,2,3} (catches bare 0-based or literal-0 callers at CI
time before the runtime bounds-check would silently drop them),
- round-trips each (caller, ch) through the helper arithmetic and
checks the final address equals REG_CH{ch+1}_*.
Adversarially validated: reverting any one helper, all four helpers,
corrupting the parsed register map, injecting a bare-ch caller, and
auto-discovering a literal-0 caller in a fresh TU each cause the
expected (and only the expected) test to fail.
Stacked on fix/adar1000-vm-tables (PR #107).
The ADAR1000 vector-modulator I/Q lookup tables VM_I[128] and VM_Q[128]
were declared but defined as empty initialiser lists since the first
commit (5fbe97f). Every call to adarSetRxPhase / adarSetTxPhase therefore
wrote (I=0x00, Q=0x00) to registers 0x21/0x23 (Rx) and 0x32/0x34 (Tx)
regardless of the requested phase state, leaving beam steering completely
non-functional in firmware.
This commit:
* Populates VM_I[128] and VM_Q[128] from ADAR1000 datasheet Rev. B
Tables 13-16 (p.34) on a uniform 2.8125 deg grid (360 / 128 states).
Byte format: bits[7:6] reserved 0, bit[5] polarity (1 = positive
lobe), bits[4:0] 5-bit unsigned magnitude - exactly as specified.
* Removes VM_GAIN[128] declaration and (empty) definition. The
ADAR1000 has no separate VM gain register; per-channel VGA gain is
set via CHx_RX_GAIN (0x10-0x13) / CHx_TX_GAIN (0x1C-0x1F) by
adarSetRxVgaGain / adarSetTxVgaGain. VM_GAIN was never populated,
never read anywhere in the firmware, and its presence falsely
suggested a missing scaling step in the signal path.
* Adds 9_Firmware/tests/cross_layer/adar1000_vm_reference.py: an
independently-derived ground-truth module containing the full
datasheet table plus byte-format / uniform-grid / quadrant-symmetry
/ cardinal-point invariant checkers and a tolerant C array parser.
* Adds TestTier2Adar1000VmTableGroundTruth (9 tests) to
test_cross_layer_contract.py, including a tokenising C/C++
comment+string stripper used by the VM_GAIN reintroduction guard,
and an adversarial self-test that corrupts one byte and asserts
the comparison detects it (defends against silent bypass via
future fixture/parser refactors).
Adversarially validated: removing the firmware definitions, flipping
a single byte, or reintroducing VM_GAIN as code each cause the suite
to fail; restoring causes it to pass. VM_GAIN appearing inside string
literals or comments correctly does NOT trip the guard.
Closes the empty-table half of the ADAR1000 phase-control bug class.
The separate channel-rotation issue (#90) will be addressed in a
follow-up PR.
Refs: 7_Components Datasheets and Application notes/ADAR1000.pdf
Rev. B Tables 13-16 p.34
The firmware uses the C++ ADAR1000_Manager class exclusively. The C-style
driver pair (adar1000.c, 693 LoC; adar1000.h, 294 LoC) has no external
call sites:
grep -rn "Adar_Set|Adar_Read|Adar_Write|Adar_Soft" 9_Firmware
grep -rn "AdarDevice|AdarBiasCurrents|AdarDeviceInfo" 9_Firmware
Both return hits only inside adar1000.c/h themselves. ADAR1000_Manager.h
has its own copies of REG_CH1_*, REG_INTERFACE_CONFIG_A, etc. and does
not include adar1000.h. main.cpp had a lone #include "adar1000.h" but
referenced no symbols from it; the REG_* macros it uses resolve through
ADAR1000_Manager.h on the next line.
No behaviour change: the deleted code was unreachable.
Side note on #90: adar1000.c contained a second copy of the
REG_CH1_* + (channel & 0x03) channel-rotation pattern tracked in #90
(lines 349, 397-398, 472, 520-521). This commit does not fix#90 --
the live path in ADAR1000_Manager.cpp still needs the channel-index
fix -- but it removes the dormant copy so the bug has one less place
to hide.
Verification:
- 9_Firmware/9_1_Microcontroller/tests: make clean && make -> all passing
(51/51 UM982 GPS, 24/24 driver, 13/13 ADAR1000_AGC, bugs #1-15, Gap-3
fixes 1-5, safety fixes)
- 9_Firmware/tests/cross_layer: 29 passed
- grep -rn "adar1000\.h|adar1000\.c|Adar_|AdarDevice" 9_Firmware: 0 hits
Resolve cross-layer AGC control mismatch where opcode 0x28 only
controlled the FPGA inner-loop AGC but the STM32 outer-loop AGC
(ADAR1000_AGC) ran independently with its own enable state.
FPGA: Drive gpio_dig6 from host_agc_enable instead of tied low,
making the FPGA register the single source of truth for AGC state.
MCU: Change ADAR1000_AGC constructor default from enabled(true) to
enabled(false) so boot state matches FPGA reset default (AGC off).
Read DIG_6 GPIO every frame with 2-frame confirmation debounce to
sync outerAgc.enabled — prevents single-sample glitch from causing
spurious AGC state transitions.
Tests: Update MCU unit tests for new default, add 6 cross-layer
contract tests verifying the FPGA-MCU-GUI AGC invariant chain.
- Add ERROR_COUNT sentinel to SystemError_t enum
- Change error_strings[] to static const char* const
- Add static_assert to enforce enum/array sync at compile time
- Add runtime bounds check with fallback for invalid error codes
- Add all missing test binary names to .gitignore
The gap3, agc, and gps test binaries (Mach-O executables compiled on macOS)
were accidentally tracked. CI runs on Linux and fails with 'Exec format error'.
Removed from index and added to .gitignore.
FPGA-001: The previous fix derived frame boundaries from chirp_counter==0,
but that counter comes from plfm_chirp_controller_enhanced which overflows
to N (not wrapping at chirps_per_elev). This caused frame pulses only on
6-bit rollover (every 64 chirps) instead of every N chirps. Now wires the
CDC-synchronized tx_new_chirp_frame_sync signal from the transmitter into
radar_receiver_final, giving correct per-frame timing for any N.
STM32-004: Changed ad9523_init() failure path from Error_Handler() to
return -1, matching the pattern used by ad9523_setup() and ad9523_status()
in the same function. Both halt the system, but return -1 keeps IRQs
enabled for diagnostic output.
checkSystemHealth()'s internal watchdog (pre-fix step 9) had two linked
defects that, combined with the previous commit's escalation of
ERROR_WATCHDOG_TIMEOUT to Emergency_Stop(), would false-latch AERIS-10:
1. Cold-start false trip:
static uint32_t last_health_check = 0;
if (HAL_GetTick() - last_health_check > 60000) { trip; }
On the first call, last_health_check == 0, so the subtraction
against a seeded-zero sentinel exceeds 60 000 ms as soon as the MCU
has been up >60 s -- normal after the ADAR1000 / AD9523 / ADF4382
init sequence -- and the watchdog trips spuriously.
2. Stale timestamp after early returns:
last_health_check = HAL_GetTick(); // at END of function
Every earlier sub-check (IMU, BMP180, GPS, PA Idq, temperature) has
an `if (fault) return current_error;` path that skips the update.
After ~60 s of transient faults, the next clean call compares
against a long-stale last_health_check and trips.
With ERROR_WATCHDOG_TIMEOUT now escalating to Emergency_Stop(), either
failure mode would cut the RF rails on a perfectly healthy system.
Fix: move the watchdog check to function ENTRY. A dedicated cold-start
branch seeds the timestamp on the first call without checking. On every
subsequent call, the elapsed delta is captured first and
last_health_check is updated BEFORE any sub-check runs, so early returns
no longer leave a stale value. 32-bit tick-wrap semantics are preserved
because the subtraction remains on uint32_t.
Add test_gap3_health_watchdog_cold_start.c covering cold-start, paced
main-loop, stall detection, boundary (exactly 60 000 ms), recovery
after trip, and 32-bit HAL_GetTick() wrap -- wired into tests/Makefile
alongside the existing gap-3 safety tests.
STM32-006: Remove blocking do-while loop that waited for legacy GUI start
flag — production V7 PyQt GUI never sends it, hanging the MCU at boot.
STM32-004: Check ad9523_init() return code and call Error_Handler() on
failure, matching the pattern used by all other hardware init calls.
FPGA-001: Simplify frame boundary detection to only trigger on
chirp_counter wrap-to-zero. Previous conditions checking == N and == 2N
were unreachable dead code (counter wraps at N-1). Now correct for any
chirps_per_elev value.
- Rename ERROR_STEPPER_FAULT → ERROR_STEPPER_MOTOR to match main.cpp enum
- Update critical-error predicate to include ERROR_TEMPERATURE_HIGH and
ERROR_WATCHDOG_TIMEOUT (was testing stale pre-fix logic)
- Test 4 now asserts overtemp DOES trigger e-stop (previously asserted opposite)
- Add Test 5 (watchdog triggers e-stop) and Test 6 (memory alloc does not)
- Add ERROR_MEMORY_ALLOC and ERROR_WATCHDOG_TIMEOUT to local enum
- 7 tests, all pass
The previous commit accidentally introduced the literal 2-byte sequence
'\r' at the end of two backslash-continuation lines (TESTS_STANDALONE
and the .PHONY list). GNU make on Linux treats that as text rather than
a line continuation, which orphans the following line with leading
spaces and aborts CI with:
Makefile:68: *** missing separator (did you mean TAB instead of 8 spaces?)
Strip the extraneous 'r' so each continuation ends with a real backslash
+ LF.
handleSystemError() only called Emergency_Stop() for error codes in
[ERROR_RF_PA_OVERCURRENT .. ERROR_POWER_SUPPLY] (9..13). Two critical
faults were left out of the gate and fell through to attemptErrorRecovery()'s
default log-and-continue branch:
- ERROR_TEMPERATURE_HIGH (14): raised by checkSystemHealth() when the
hottest of 8 PA thermal sensors exceeds 75 C. Without cutting bias
(DAC CLR) and the PA 5V0/5V5/RFPA_VDD rails, the 10 W GaN QPA2962
stages remain biased in an overtemperature state -- a thermal-runaway
path in AERIS-10E.
- ERROR_WATCHDOG_TIMEOUT (16): indicates the health-check loop has
stalled (>60 s since last pass). Transmitter state is unknown;
relying on IWDG to reset the MCU re-runs startup and re-energises
the PA rails rather than latching the safe state.
Fix: extend the critical-error predicate so these two codes also trigger
Emergency_Stop(). Add test_gap3_overtemp_emergency_stop.c covering all
17 SystemError_t values (must-trigger and must-not-trigger), wired into
tests/Makefile alongside the existing gap-3 safety tests.
Implements the STM32 outer-loop AGC (ADAR1000_AGC) that reads the FPGA
saturation flag on DIG_5/PD13 once per radar frame and adjusts the
ADAR1000 VGA common gain across all 16 RX channels.
Phase 4 — ADAR1000_AGC class (new files):
- ADAR1000_AGC.h/.cpp: attack/recovery/holdoff logic, per-channel
calibration offsets, effectiveGain() with OOB safety
- test_agc_outer_loop.cpp: 13 tests covering saturation, holdoff,
recovery, clamping, calibration, SPI spy, reset, mixed sequences
Phase 5 — main.cpp integration:
- Added #include and global outerAgc instance
- AGC update+applyGain call between runRadarPulseSequence() and
HAL_IWDG_Refresh() in main loop
Build system & shim fixes:
- Makefile: added CXX/CXXFLAGS, C++ object rules, TESTS_WITH_CXX in
ALL_TESTS (21 total tests)
- stm32_hal_mock.h: const uint8_t* for HAL_UART_Transmit (C++ compat),
__NOP() macro for host builds
- shims/main.h + real main.h: FPGA_DIG5_SAT pin defines
All tests passing: MCU 21/21, GUI 92/92, cross-layer 29/29.
Bug 1 (FPGA): status_words[0] was 37 bits (8+3+2+5+3+16), silently
truncated to 32. Restructured to {0xFF, mode[1:0], stream[2:0],
3'b000, threshold[15:0]} = 32 bits exactly. Fixed in both
usb_data_interface_ft2232h.v and usb_data_interface.v.
Bug 2 (Python): radar_mode extracted at bit 21 but was actually at
bit 24 after truncation — always returned 0. Updated shift/mask in
parse_status_packet() to match new layout (mode>>22, stream>>19).
Bug 3 (STM32): parseFromUSB() minimum size check was 74 bytes but
9 doubles + uint32 + markers = 82 bytes. Buffer overread on last
fields when 74-81 bytes passed.
All 166 tests pass (29 cross-layer, 92 GUI, 20 MCU, 25 FPGA).
B12: PA IDQ calibration loop condition inverted (< 0.2 -> > 0.2) for both DAC1/DAC2
B13: DAC2 ADC buffer mismatch — reads from hadc2 now correctly stored to adc2_readings
B14: DIAG_SECTION macro call sites changed from 2-arg to 1-arg form (4 sites)
B15: htim3 definition + MX_TIM3_Init() added (PWM mode, CH2+CH3, Period=999)
B16: Removed stale NO-OP annotation on TriggerTimedSync (already fixed in Bug #3)
B17: Updated stale GPIO-only warnings to reflect TIM3 PWM implementation (Bug #5)
All 15 tests pass (11 original + 4 new for B12-B15).
Bug #11: platform_noos_stm32.c used HAL_SPI_Transmit instead of
HAL_SPI_TransmitReceive — reads returned garbage. Changed to in-place
full-duplex. Dead code (never called), fixed per audit recommendation.
Test added: test_bug11_platform_spi_transmit_only.c. Mock infrastructure
updated with SPI spy types. All 11 firmware tests pass.
FPGA B2: Migrated long_chirp_lut[0:3599] from ~700 lines of hardcoded
assignments to BRAM with (* ram_style = "block" *) attribute and
$readmemh("long_chirp_lut.mem"). Added sync-only read block for proper
BRAM inference. 1-cycle read latency introduced. short_chirp_lut left
as distributed RAM (60 entries, too small for BRAM).
FPGA B3: Added BREG (window_val_reg) and MREG (mult_i_raw/mult_q_raw)
pipeline stages to doppler_processor.v. Eliminates DPIP-1 and DPOP-2
DRC warnings. S_LOAD_FFT retimed: fft_input_valid starts at sub=2,
+1 cycle total latency. BREG primed in S_PRE_READ at no extra cost.
Both FPGA files compile clean with Icarus Verilog.
Bug #9: Both TX and RX SPI init params had platform_ops = NULL, causing
adf4382_init() -> no_os_spi_init() to fail with -EINVAL. Fixed by setting
platform_ops = &stm32_spi_ops and passing stm32_spi_extra with correct CS
port/pin for each device.
Bug #10: stm32_spi_write_and_read() never toggled chip select. Since TX
and RX ADF4382A share SPI4, every register write hit both PLLs. Rewrote
stm32_spi.c to assert CS LOW before transfer and deassert HIGH after,
using stm32_spi_extra metadata. Backward-compatible: legacy callers
(e.g., AD9523) with cs_port=NULL skip CS management.
Also widened chip_select from uint8_t to uint16_t in no_os_spi.h since
STM32 GPIO_PIN_xx values (e.g., GPIO_PIN_14=0x4000) overflow uint8_t.
10/10 tests pass (8 original + 2 new regression tests).
Observe-before-fix instrumentation for bench bring-up: adds timestamped
DIAG logging to the AD9523 clock config, ADF4382A LO manager, power
sequencing, lock monitoring, temperature monitoring, and safe-mode entry.
Annotates known bugs (double ad9523_setup call, timed-sync init ordering,
TriggerTimedSync no-op, phase-shift before init-check, last_check timer
collision) without changing any runtime behavior.