RTL fixes discovered via new end-to-end testbench:
- plfm_chirp_controller: TX/RX mixer enables now mutually exclusive
by FSM state (Fix#4), preventing simultaneous TX+RX activation
- usb_data_interface: stream control reset default 3'b001 (range-only),
added doppler/cfar data_pending sticky flags, write FSM triggers on
range_valid only — eliminates startup deadlock (Fix#5)
- radar_receiver_final: STM32 toggle signals wired through for mode-00
pass-through, dynamic frame detection via host_chirps_per_elev
- radar_system_top: STM32 toggle signal wiring to receiver instance
- chirp_memory_loader_param: explicit readmemh range for short chirp
Test infrastructure:
- New tb_system_e2e.v: 46 checks across 12 groups (reset, TX, safety,
RX, USB R/W, CDC, beam scanning, reset recovery, stream control,
latency budgets, watchdog)
- tb_usb_data_interface: Tests 21/22/56 updated for data_pending
architecture (preload flags, verify consumption instead of state)
- tb_chirp_controller: mixer tests T7.1/T7.2 updated for Fix#4
- run_regression.sh: PASS/FAIL regex fixed to match only [PASS]/[FAIL]
markers, added E2E test entry
- Updated rx_final_doppler_out.csv golden data
Opt 1: Eliminated ST_BF_SHIFT state — arithmetic right-shift is pure
bit-selection (zero logic levels), merged into BF_WRITE combinational
add/subtract. Saves LOG2N * N/2 = 5120 cycles per 1024-pt FFT.
Opt 2: Replaced idx_val * tw_stride_reg general multiply with
idx_val << (LOG2N-1-stage) barrel shift. tw_stride_reg is always a
power of 2, so this is mathematically identical and frees a multiplier.
Regression: 18/18 FPGA pass (bit-exact results).
Adds two-layer lint pass (iverilog -Wall + custom static checks) that
catches part-select OOB errors and case-without-default warnings before
pushing to remote Vivado. Catches the exact Synth 8-524 class error that
broke Build 18 initial attempt. Lint errors abort regression; warnings
are advisory. Regenerated golden data for BRAM-migrated matched filter.
Vivado 2025.2 (Synth 8-524): overlap_copy_count is 8-bit but [9:0]
part-select was 10-bit. Changed to explicit zero-extend concat.
Added default case to FSM to suppress non-full case warning.
- Changed user_led/system_status IOSTANDARD from LVCMOS25 to LVCMOS33
to match VIOTB=3.3V needed for FT601 on Bank 16
- Added register init value for hb_counter
- Added comments documenting clock source (50 MHz FIFO0CLK at U20, Bank 14)
and expected LED toggle rates
Maps all 47 FT601 signals through FMC LPC J10 to correct FPGA pins:
- DATA[31:0] + D_CLK: Bank 15 (LA17-LA33)
- BE_N[3:0], control, status: Bank 16 (LA00-LA15)
Both banks share VIOTB rail — set to 3.3V for LVCMOS33.
Includes timing constraints and RTL adaptation notes.
Bug #11: platform_noos_stm32.c used HAL_SPI_Transmit instead of
HAL_SPI_TransmitReceive — reads returned garbage. Changed to in-place
full-duplex. Dead code (never called), fixed per audit recommendation.
Test added: test_bug11_platform_spi_transmit_only.c. Mock infrastructure
updated with SPI spy types. All 11 firmware tests pass.
FPGA B2: Migrated long_chirp_lut[0:3599] from ~700 lines of hardcoded
assignments to BRAM with (* ram_style = "block" *) attribute and
$readmemh("long_chirp_lut.mem"). Added sync-only read block for proper
BRAM inference. 1-cycle read latency introduced. short_chirp_lut left
as distributed RAM (60 entries, too small for BRAM).
FPGA B3: Added BREG (window_val_reg) and MREG (mult_i_raw/mult_q_raw)
pipeline stages to doppler_processor.v. Eliminates DPIP-1 and DPOP-2
DRC warnings. S_LOAD_FFT retimed: fft_input_valid starts at sub=2,
+1 cycle total latency. BREG primed in S_PRE_READ at no extra cost.
Both FPGA files compile clean with Icarus Verilog.
Reduce routing pressure on CIC/NCO critical paths and move Doppler BRAM read-address registers to sync-reset datapath logic so Build 13 closes with stronger setup/hold slack while preserving functional behavior.
CDC fixes across 6 RTL files based on post-implementation report_cdc analysis:
- P0: sync stm32_mixers_enable and new_chirp_pulse to clk_120m via toggle CDC
in radar_transmitter, add ft601 reset synchronizer and USB holding
registers with proper edge detection in usb_data_interface
- P1: add ASYNC_REG to edge_detector, convert new_chirp_frame to toggle CDC,
fix USB valid edge detect to use fully-synced signal
- P2: register Gray encoding in cdc_adc_to_processing source domain, sync
ft601_txe and stm32_mixers_enable for status_reg in radar_system_top
- Safety: add in_bin_count overflow guard in range_bin_decimator to prevent
downstream BRAM corruption
All 13 regression test suites pass (159 individual tests).
DSP48E1 lookup table quantization causes dithering near zero crossings
at low frequencies (1 MHz), producing ~11 sign transitions vs ~5 expected.
Widen accepted range from [3,8] to [3,15].
- Expand ft601_be from [1:0] to [3:0] across RTL, top-level, testbenches,
and XDC (uncomment be[2:3] pin assignments B21/A21)
- Fix NCO XSim testbench: correct reset check (0x7FFF not 0), add pipeline
warmup and sample skip for DSP48E1 quadrature test
- All local regression tests pass (39/39 USB, 10/10 integration, all co-sim)
New testbenches:
- tb_latency_buffer.v: 13/13 tests for BRAM delay line (P1-3)
- tb_cdc_modules.v: 27/27 tests for all 3 CDC primitives (P1-4)
- tb_ad9484_xsim.v: XSim testbench for AD9484 with Xilinx BUFIO/IDDR
- tb_nco_xsim.v: XSim testbench for NCO DSP48E1 path verification
Fixes:
- tb_usb_data_interface.v: updated test 33 from divide-by-2 check to
ODDR-style clock forwarding verification (39/39 pass)
- rx_final_doppler_out.csv: updated golden reference after bug fixes
P0-1: nco_400m_enhanced.v — DSP48E1 OPMODE corrected from PCIN to P
feedback (was routing stale cascade data into accumulator)
P0-2: radar_receiver_final.v — removed same-clock CDC that corrupted
ADC data path between ad9484_interface and DDC
P1-5: fir_lowpass.v — fixed zero replication count in coefficient
symmetric extension ({0{1'b0}} is empty, now uses explicit 0)
Also updates .gitignore to exclude debug/scratch artifacts.
All 30+ testbenches pass (unit, co-sim, integration).
Previously segment 0 only filled positions [0:895], leaving [896:1023]
as zeros from the initial block. These zeros propagated into the overlap
region of subsequent segments, corrupting the convolution.
Fix: change buffer-full threshold from SEGMENT_ADVANCE (896) to
BUFFER_SIZE (1024). Add zero-padding for the last segment when chirp
data runs out before the buffer is full. Updated testbench accordingly.
Verified: multi-segment 32/32 PASS, integration test 10/10 PASS.
Previously the S_IDLE->S_ACCUMULATE transition consumed one data_valid
cycle without writing to BRAM, losing the first sample. The testbench
worked around this by sending sample[0] twice.
Fix: drive mem_we + data capture in S_IDLE on the transition cycle and
advance write_range_bin to 1. Testbench workaround removed.
Verified: 3/3 Doppler co-sim BIT-PERFECT, integration test 10/10 PASS.
RTL fix: matched_filter_multi_segment.v ST_WAIT_FFT now waits for
processing chain to complete ALL 1024 outputs and return to IDLE
before advancing to next segment. Previously, it transitioned on the
first fft_pc_valid edge, causing the chain to still be outputting
while multi-seg started collecting data for the next segment. This
broke the handshake and caused permanent deadlock after segment 0.
Also fixes forward reference of sample_addr_from_chain in
radar_receiver_final.v (declaration moved before first use).
New files:
- tb/tb_radar_receiver_final.v: P0 integration test for full RX
pipeline (ADC->DDC->MF->range_bin_decimator->doppler), 10 checks
- tb/ad9484_interface_400m_stub.v: behavioral ADC stub for iverilog
All existing tests still pass: multi-seg 32/32, MF co-sim 3/3,
Doppler co-sim 14/14.
Bug: rounding logic 'adc_i <= ddc_i[17:2] + ddc_i[1]' overflows when
ddc_i[17:2]=0x7FFF and ddc_i[1]=1, causing 0x7FFF+1=0x8000 (sign flip
from max positive to most negative value).
Fix: add explicit saturation — clamp to 0x7FFF when truncated value is
max positive and round bit is set. Negative values cannot overflow since
rounding only moves toward zero.
New testbench: tb_ddc_input_interface.v with 26 tests covering rounding,
truncation, overflow saturation at positive boundary, negative full scale,
valid synchronization, and sync error detection.