Per-chirp T/R switching is owned by the FPGA plfm_chirp_controller
driving adar_tr_x pins (TR_SOURCE=1 in REG_SW_CONTROL, already set by
initializeSingleDevice). The MCU's SPI RMW path via fastTXMode/
fastRXMode/pulseTXMode/pulseRXMode/setADTR1107Control was:
(a) architecturally redundant — raced the FPGA-driven TR line,
(b) toggled the wrong bit (TR_SOURCE instead of TR_SPI),
(c) in setFastSwitchMode(true) bundled a datasheet-violating
PA+LNA-simultaneously-biased side effect.
Removed methods and their backing state (fast_switch_mode_,
switch_settling_time_us_). Call sites in executeChirpSequence /
runRadarPulseSequence updated to rely on the FPGA chirp FSM (GPIOD_8
new_chirp trigger unchanged).
Tests: adds CMSIS-Core DWT/CoreDebug/SystemCoreClock stubs to
stm32_hal_mock so F-4.7's DWT-based delayUs() compiles under the host
mock build. SystemCoreClock=0 makes the busy-wait exit immediately.
The `broadcast=1` path on adarWrite() emitted the 0x08 broadcast opcode
but setChipSelect() only asserts one device's CS line, so only the single
selected chip ever saw the frame. The opcode path has also never been
validated on silicon. Until a HIL test confirms multi-CS semantics, route
broadcast=1 through a unicast loop over all devices so caller intent
(all four take the write) is preserved and the dead opcode path becomes
unreachable. Logs a DIAG_WARN on entry for visibility.
Addresses the remaining actionable items from
docs/DEVELOP_AUDIT_2026-04-19.md after commit 3f47d1e.
XDC (dead waivers — F-0.4, F-0.5, F-0.6, F-0.7):
- ft_clkout_IBUF CLOCK_DEDICATED_ROUTE now uses hierarchical filter;
flat net name did not exist post-synth.
- reset_sync_reg[*] false-path rewritten to walk hierarchy and filter
on CLR/PRE pins.
- adc_clk_mmcm.xdc ft601_clk_in references replaced with foreach-loop
over real USB clock names, gated on -quiet existence.
- MMCM LOCKED waiver uses REF_PIN_NAME filter instead of the
previously-missing u_core/ literal path.
CDC (F-1.1, F-1.2, F-1.3):
- Documented the quasi-static-bus stability invariant above the
FT601 cmd_valid toggle block.
- cdc_adc_to_processing gains an `overrun` output; the two CIC->FIR
instances feed a sticky cdc_cic_fir_overrun flag surfaced on
gpio_dig5 so silent sample drops become visible to the MCU.
- Removed the dead mixers_enable synchronizer in ddc_400m.v; the _sync
output was unused and every caller ties the port to 1'b1.
Diagnostics (F-6.4):
- range_bin_decimator watchdog_timeout plumbed through receiver
and top-level, OR'd into gpio_dig5.
ADAR (F-4.7):
- delayUs() replaced with DWT cycle counter; self-initialising
TRCENA/CYCCNTENA, overflow-safe unsigned subtraction.
Regression: tb_cdc_modules.v 57/57 passes under iverilog after
the cdc_modules.v change. Remote Vivado verification in progress.
Addresses findings from docs/DEVELOP_AUDIT_2026-04-19.md:
P0 source-level:
- F-4.3 ADAR1000_Manager::adarSetTxPhase now writes REG_LOAD_WORKING
with LD_WRK_REGS_LDTX_OVERRIDE (0x02) instead of 0x01. Previous value
toggled the LDRX latch on a TX-phase write, so host TX phase updates
never reached the working registers.
- F-6.1 DDC mixer_saturation / filter_overflow / diagnostics were deleted
at the receiver boundary. Now plumbed to new outputs on
radar_receiver_final (ddc_overflow_any, ddc_saturation_count) and
aggregated into gpio_dig5 in radar_system_top. Added mark_debug
attributes for ILA visibility. Test/debug inputs tied low explicitly.
- F-0.8 adc_clk_mmcm.xdc set_clock_uncertainty: removed invalid -add
flag (Vivado silently rejected it, applying zero guardband). Now uses
absolute 0.150 ns which covers 53 ps jitter + ~100 ps PVT margin.
P1:
- F-4.2 adarSetBit / adarResetBit reject broadcast=ON — the RMW sampled
a single device but wrote to all four, clobbering the other three's
state.
- F-4.4 initializeSingleDevice returns false and leaves initialized=false
when scratchpad verification fails; previously marked the device
initialized anyway so downstream PA enable could drive a dead bus.
- F-6.2 FIR I/Q filter_overflow ports, previously unconnected, now OR'd
into the module-level filter_overflow output.
- F-6.3 mti_canceller exposes 8-bit saturation counter. Saturation was
previously invisible and produces spurious Doppler harmonics.
Verification:
- 27/27 iverilog testbenches pass
- 228/228 pytest pass (cross-layer contract + cosim)
- MCU unit tests 51/51 + 24/24 pass
- Remote Vivado 2025.2 build: bitstream writes; 400 MHz mixer pipeline
now shows WNS -0.109 ns which MATCHES the audit's F-0.9 prediction
that the design only closed because F-0.8's guardband was silently
dropped. ft_clkout F-0.9 remains a show-stopper (requires MRCC pin
move), tracked separately.
Not addressed in this PR (larger scope, follow-up tickets):
F-0.4, F-0.5, F-0.6, F-0.7, F-0.9, F-1.1, F-1.2, F-2.2, F-3.2, F-4.1,
F-4.7, F-6.4, F-6.5.
Three conflicts — all resolved in favor of develop, which has a more
refined version of the same work this branch introduced:
- radar_system_top.v: develop's cleaner USB_MODE=1 comment (same value).
- run_regression.sh: develop's ${SYSTEM_RTL[@]} refactor + added
USB_MODE=1 test variants.
- tb/radar_system_tb.v: develop's ifdef USB_MODE_1 to dump the correct
USB instance based on mode.
The 400 MHz reset fan-out fix (nco_400m_enhanced, cic_decimator_4x_enhanced,
ddc_400m) and ADAR1000 channel-indexing fix remain intact on this branch.
The four channel-indexed ADAR1000 setters (adarSetRxPhase, adarSetTxPhase,
adarSetRxVgaGain, adarSetTxVgaGain) computed their register offset as
`(channel & 0x03) * stride`, which silently aliased CH4 (channel=4 ->
mask=0) onto CH1 and shifted CH1..CH3 by one. The API contract (1-based
CH1..CH4) is documented in ADAR1000_AGC.cpp:76 and matches the ADI
datasheet; every existing caller already passes `ch + 1`.
Fix: subtract 1 before masking -- `((channel - 1) & 0x03) * stride` --
and reject `channel < 1 || channel > 4` early with a DIAG message so a
future stale 0-based caller fails loudly instead of writing to CH4.
Adds TestTier1Adar1000ChannelRegisterRoundTrip (9 tests) which closes
the loop independently of the driver:
- parses the ADI register map directly from ADAR1000_Manager.h,
- verifies the datasheet stride invariants (gain=1, phase=2),
- auto-discovers every C++ TU under MCU_LIB_DIR / MCU_CODE_DIR so a
new caller cannot silently escape the round-trip check,
- asserts every caller's channel argument evaluates to {1,2,3,4} for
ch in {0,1,2,3} (catches bare 0-based or literal-0 callers at CI
time before the runtime bounds-check would silently drop them),
- round-trips each (caller, ch) through the helper arithmetic and
checks the final address equals REG_CH{ch+1}_*.
Adversarially validated: reverting any one helper, all four helpers,
corrupting the parsed register map, injecting a bare-ch caller, and
auto-discovering a literal-0 caller in a fresh TU each cause the
expected (and only the expected) test to fail.
Stacked on fix/adar1000-vm-tables (PR #107).
The ADAR1000 vector-modulator I/Q lookup tables VM_I[128] and VM_Q[128]
were declared but defined as empty initialiser lists since the first
commit (5fbe97f). Every call to adarSetRxPhase / adarSetTxPhase therefore
wrote (I=0x00, Q=0x00) to registers 0x21/0x23 (Rx) and 0x32/0x34 (Tx)
regardless of the requested phase state, leaving beam steering completely
non-functional in firmware.
This commit:
* Populates VM_I[128] and VM_Q[128] from ADAR1000 datasheet Rev. B
Tables 13-16 (p.34) on a uniform 2.8125 deg grid (360 / 128 states).
Byte format: bits[7:6] reserved 0, bit[5] polarity (1 = positive
lobe), bits[4:0] 5-bit unsigned magnitude - exactly as specified.
* Removes VM_GAIN[128] declaration and (empty) definition. The
ADAR1000 has no separate VM gain register; per-channel VGA gain is
set via CHx_RX_GAIN (0x10-0x13) / CHx_TX_GAIN (0x1C-0x1F) by
adarSetRxVgaGain / adarSetTxVgaGain. VM_GAIN was never populated,
never read anywhere in the firmware, and its presence falsely
suggested a missing scaling step in the signal path.
* Adds 9_Firmware/tests/cross_layer/adar1000_vm_reference.py: an
independently-derived ground-truth module containing the full
datasheet table plus byte-format / uniform-grid / quadrant-symmetry
/ cardinal-point invariant checkers and a tolerant C array parser.
* Adds TestTier2Adar1000VmTableGroundTruth (9 tests) to
test_cross_layer_contract.py, including a tokenising C/C++
comment+string stripper used by the VM_GAIN reintroduction guard,
and an adversarial self-test that corrupts one byte and asserts
the comparison detects it (defends against silent bypass via
future fixture/parser refactors).
Adversarially validated: removing the firmware definitions, flipping
a single byte, or reintroducing VM_GAIN as code each cause the suite
to fail; restoring causes it to pass. VM_GAIN appearing inside string
literals or comments correctly does NOT trip the guard.
Closes the empty-table half of the ADAR1000 phase-control bug class.
The separate channel-rotation issue (#90) will be addressed in a
follow-up PR.
Refs: 7_Components Datasheets and Application notes/ADAR1000.pdf
Rev. B Tables 13-16 p.34
The firmware uses the C++ ADAR1000_Manager class exclusively. The C-style
driver pair (adar1000.c, 693 LoC; adar1000.h, 294 LoC) has no external
call sites:
grep -rn "Adar_Set|Adar_Read|Adar_Write|Adar_Soft" 9_Firmware
grep -rn "AdarDevice|AdarBiasCurrents|AdarDeviceInfo" 9_Firmware
Both return hits only inside adar1000.c/h themselves. ADAR1000_Manager.h
has its own copies of REG_CH1_*, REG_INTERFACE_CONFIG_A, etc. and does
not include adar1000.h. main.cpp had a lone #include "adar1000.h" but
referenced no symbols from it; the REG_* macros it uses resolve through
ADAR1000_Manager.h on the next line.
No behaviour change: the deleted code was unreachable.
Side note on #90: adar1000.c contained a second copy of the
REG_CH1_* + (channel & 0x03) channel-rotation pattern tracked in #90
(lines 349, 397-398, 472, 520-521). This commit does not fix#90 --
the live path in ADAR1000_Manager.cpp still needs the channel-index
fix -- but it removes the dormant copy so the bug has one less place
to hide.
Verification:
- 9_Firmware/9_1_Microcontroller/tests: make clean && make -> all passing
(51/51 UM982 GPS, 24/24 driver, 13/13 ADAR1000_AGC, bugs #1-15, Gap-3
fixes 1-5, safety fixes)
- 9_Firmware/tests/cross_layer: 29 passed
- grep -rn "adar1000\.h|adar1000\.c|Adar_|AdarDevice" 9_Firmware: 0 hits
Resolve cross-layer AGC control mismatch where opcode 0x28 only
controlled the FPGA inner-loop AGC but the STM32 outer-loop AGC
(ADAR1000_AGC) ran independently with its own enable state.
FPGA: Drive gpio_dig6 from host_agc_enable instead of tied low,
making the FPGA register the single source of truth for AGC state.
MCU: Change ADAR1000_AGC constructor default from enabled(true) to
enabled(false) so boot state matches FPGA reset default (AGC off).
Read DIG_6 GPIO every frame with 2-frame confirmation debounce to
sync outerAgc.enabled — prevents single-sample glitch from causing
spurious AGC state transitions.
Tests: Update MCU unit tests for new default, add 6 cross-layer
contract tests verifying the FPGA-MCU-GUI AGC invariant chain.
- Add ERROR_COUNT sentinel to SystemError_t enum
- Change error_strings[] to static const char* const
- Add static_assert to enforce enum/array sync at compile time
- Add runtime bounds check with fallback for invalid error codes
- Add all missing test binary names to .gitignore
The gap3, agc, and gps test binaries (Mach-O executables compiled on macOS)
were accidentally tracked. CI runs on Linux and fails with 'Exec format error'.
Removed from index and added to .gitignore.
FPGA-001: The previous fix derived frame boundaries from chirp_counter==0,
but that counter comes from plfm_chirp_controller_enhanced which overflows
to N (not wrapping at chirps_per_elev). This caused frame pulses only on
6-bit rollover (every 64 chirps) instead of every N chirps. Now wires the
CDC-synchronized tx_new_chirp_frame_sync signal from the transmitter into
radar_receiver_final, giving correct per-frame timing for any N.
STM32-004: Changed ad9523_init() failure path from Error_Handler() to
return -1, matching the pattern used by ad9523_setup() and ad9523_status()
in the same function. Both halt the system, but return -1 keeps IRQs
enabled for diagnostic output.
checkSystemHealth()'s internal watchdog (pre-fix step 9) had two linked
defects that, combined with the previous commit's escalation of
ERROR_WATCHDOG_TIMEOUT to Emergency_Stop(), would false-latch AERIS-10:
1. Cold-start false trip:
static uint32_t last_health_check = 0;
if (HAL_GetTick() - last_health_check > 60000) { trip; }
On the first call, last_health_check == 0, so the subtraction
against a seeded-zero sentinel exceeds 60 000 ms as soon as the MCU
has been up >60 s -- normal after the ADAR1000 / AD9523 / ADF4382
init sequence -- and the watchdog trips spuriously.
2. Stale timestamp after early returns:
last_health_check = HAL_GetTick(); // at END of function
Every earlier sub-check (IMU, BMP180, GPS, PA Idq, temperature) has
an `if (fault) return current_error;` path that skips the update.
After ~60 s of transient faults, the next clean call compares
against a long-stale last_health_check and trips.
With ERROR_WATCHDOG_TIMEOUT now escalating to Emergency_Stop(), either
failure mode would cut the RF rails on a perfectly healthy system.
Fix: move the watchdog check to function ENTRY. A dedicated cold-start
branch seeds the timestamp on the first call without checking. On every
subsequent call, the elapsed delta is captured first and
last_health_check is updated BEFORE any sub-check runs, so early returns
no longer leave a stale value. 32-bit tick-wrap semantics are preserved
because the subtraction remains on uint32_t.
Add test_gap3_health_watchdog_cold_start.c covering cold-start, paced
main-loop, stall detection, boundary (exactly 60 000 ms), recovery
after trip, and 32-bit HAL_GetTick() wrap -- wired into tests/Makefile
alongside the existing gap-3 safety tests.
STM32-006: Remove blocking do-while loop that waited for legacy GUI start
flag — production V7 PyQt GUI never sends it, hanging the MCU at boot.
STM32-004: Check ad9523_init() return code and call Error_Handler() on
failure, matching the pattern used by all other hardware init calls.
FPGA-001: Simplify frame boundary detection to only trigger on
chirp_counter wrap-to-zero. Previous conditions checking == N and == 2N
were unreachable dead code (counter wraps at N-1). Now correct for any
chirps_per_elev value.
- Rename ERROR_STEPPER_FAULT → ERROR_STEPPER_MOTOR to match main.cpp enum
- Update critical-error predicate to include ERROR_TEMPERATURE_HIGH and
ERROR_WATCHDOG_TIMEOUT (was testing stale pre-fix logic)
- Test 4 now asserts overtemp DOES trigger e-stop (previously asserted opposite)
- Add Test 5 (watchdog triggers e-stop) and Test 6 (memory alloc does not)
- Add ERROR_MEMORY_ALLOC and ERROR_WATCHDOG_TIMEOUT to local enum
- 7 tests, all pass
The previous commit accidentally introduced the literal 2-byte sequence
'\r' at the end of two backslash-continuation lines (TESTS_STANDALONE
and the .PHONY list). GNU make on Linux treats that as text rather than
a line continuation, which orphans the following line with leading
spaces and aborts CI with:
Makefile:68: *** missing separator (did you mean TAB instead of 8 spaces?)
Strip the extraneous 'r' so each continuation ends with a real backslash
+ LF.
handleSystemError() only called Emergency_Stop() for error codes in
[ERROR_RF_PA_OVERCURRENT .. ERROR_POWER_SUPPLY] (9..13). Two critical
faults were left out of the gate and fell through to attemptErrorRecovery()'s
default log-and-continue branch:
- ERROR_TEMPERATURE_HIGH (14): raised by checkSystemHealth() when the
hottest of 8 PA thermal sensors exceeds 75 C. Without cutting bias
(DAC CLR) and the PA 5V0/5V5/RFPA_VDD rails, the 10 W GaN QPA2962
stages remain biased in an overtemperature state -- a thermal-runaway
path in AERIS-10E.
- ERROR_WATCHDOG_TIMEOUT (16): indicates the health-check loop has
stalled (>60 s since last pass). Transmitter state is unknown;
relying on IWDG to reset the MCU re-runs startup and re-energises
the PA rails rather than latching the safe state.
Fix: extend the critical-error predicate so these two codes also trigger
Emergency_Stop(). Add test_gap3_overtemp_emergency_stop.c covering all
17 SystemError_t values (must-trigger and must-not-trigger), wired into
tests/Makefile alongside the existing gap-3 safety tests.
Implements the STM32 outer-loop AGC (ADAR1000_AGC) that reads the FPGA
saturation flag on DIG_5/PD13 once per radar frame and adjusts the
ADAR1000 VGA common gain across all 16 RX channels.
Phase 4 — ADAR1000_AGC class (new files):
- ADAR1000_AGC.h/.cpp: attack/recovery/holdoff logic, per-channel
calibration offsets, effectiveGain() with OOB safety
- test_agc_outer_loop.cpp: 13 tests covering saturation, holdoff,
recovery, clamping, calibration, SPI spy, reset, mixed sequences
Phase 5 — main.cpp integration:
- Added #include and global outerAgc instance
- AGC update+applyGain call between runRadarPulseSequence() and
HAL_IWDG_Refresh() in main loop
Build system & shim fixes:
- Makefile: added CXX/CXXFLAGS, C++ object rules, TESTS_WITH_CXX in
ALL_TESTS (21 total tests)
- stm32_hal_mock.h: const uint8_t* for HAL_UART_Transmit (C++ compat),
__NOP() macro for host builds
- shims/main.h + real main.h: FPGA_DIG5_SAT pin defines
All tests passing: MCU 21/21, GUI 92/92, cross-layer 29/29.
Bug 1 (FPGA): status_words[0] was 37 bits (8+3+2+5+3+16), silently
truncated to 32. Restructured to {0xFF, mode[1:0], stream[2:0],
3'b000, threshold[15:0]} = 32 bits exactly. Fixed in both
usb_data_interface_ft2232h.v and usb_data_interface.v.
Bug 2 (Python): radar_mode extracted at bit 21 but was actually at
bit 24 after truncation — always returned 0. Updated shift/mask in
parse_status_packet() to match new layout (mode>>22, stream>>19).
Bug 3 (STM32): parseFromUSB() minimum size check was 74 bytes but
9 doubles + uint32 + markers = 82 bytes. Buffer overread on last
fields when 74-81 bytes passed.
All 166 tests pass (29 cross-layer, 92 GUI, 20 MCU, 25 FPGA).
B12: PA IDQ calibration loop condition inverted (< 0.2 -> > 0.2) for both DAC1/DAC2
B13: DAC2 ADC buffer mismatch — reads from hadc2 now correctly stored to adc2_readings
B14: DIAG_SECTION macro call sites changed from 2-arg to 1-arg form (4 sites)
B15: htim3 definition + MX_TIM3_Init() added (PWM mode, CH2+CH3, Period=999)
B16: Removed stale NO-OP annotation on TriggerTimedSync (already fixed in Bug #3)
B17: Updated stale GPIO-only warnings to reflect TIM3 PWM implementation (Bug #5)
All 15 tests pass (11 original + 4 new for B12-B15).
Bug #11: platform_noos_stm32.c used HAL_SPI_Transmit instead of
HAL_SPI_TransmitReceive — reads returned garbage. Changed to in-place
full-duplex. Dead code (never called), fixed per audit recommendation.
Test added: test_bug11_platform_spi_transmit_only.c. Mock infrastructure
updated with SPI spy types. All 11 firmware tests pass.
FPGA B2: Migrated long_chirp_lut[0:3599] from ~700 lines of hardcoded
assignments to BRAM with (* ram_style = "block" *) attribute and
$readmemh("long_chirp_lut.mem"). Added sync-only read block for proper
BRAM inference. 1-cycle read latency introduced. short_chirp_lut left
as distributed RAM (60 entries, too small for BRAM).
FPGA B3: Added BREG (window_val_reg) and MREG (mult_i_raw/mult_q_raw)
pipeline stages to doppler_processor.v. Eliminates DPIP-1 and DPOP-2
DRC warnings. S_LOAD_FFT retimed: fft_input_valid starts at sub=2,
+1 cycle total latency. BREG primed in S_PRE_READ at no extra cost.
Both FPGA files compile clean with Icarus Verilog.
Bug #9: Both TX and RX SPI init params had platform_ops = NULL, causing
adf4382_init() -> no_os_spi_init() to fail with -EINVAL. Fixed by setting
platform_ops = &stm32_spi_ops and passing stm32_spi_extra with correct CS
port/pin for each device.
Bug #10: stm32_spi_write_and_read() never toggled chip select. Since TX
and RX ADF4382A share SPI4, every register write hit both PLLs. Rewrote
stm32_spi.c to assert CS LOW before transfer and deassert HIGH after,
using stm32_spi_extra metadata. Backward-compatible: legacy callers
(e.g., AD9523) with cs_port=NULL skip CS management.
Also widened chip_select from uint8_t to uint16_t in no_os_spi.h since
STM32 GPIO_PIN_xx values (e.g., GPIO_PIN_14=0x4000) overflow uint8_t.
10/10 tests pass (8 original + 2 new regression tests).
Observe-before-fix instrumentation for bench bring-up: adds timestamped
DIAG logging to the AD9523 clock config, ADF4382A LO manager, power
sequencing, lock monitoring, temperature monitoring, and safe-mode entry.
Annotates known bugs (double ad9523_setup call, timed-sync init ordering,
TriggerTimedSync no-op, phase-shift before init-check, last_check timer
collision) without changing any runtime behavior.