Theme: The post-v1.7.0 accuracy tail. This ADR is opened by the first
decision of the cycle and grows as the release lands; per-release decisions
fold in here rather than spawning topic ADRs. The through-line is
dmc_dma_during_read4 / nessy #20: the host-side seam (D1) plus the two
CPU-side DMC-DMA steal-timing fixes (D2, D3) that, with nessy's host conflict
formula, finally converge dma_2007_read.
D1 — Tagged DMA reads (DmaReadBus) for the 2A03 DMA-read open-bus glitch (#481)
Context: The DMC-DMA-during-internal-register-read glitch
(dmc_dma_during_read4, nessy #20) only manifests when a DMC sample fetch
lands on an internal register $4000-$401F. Those reads are invisible to
external chips, so the 2A03 latches an open-bus / internal-register bus
conflict instead of a real value, and $4016/$4017 fetches delete a
controller bit. chippy's DMA loop (ProcessPendingDma, #376) issued bare,
untagged Bus.Read(addr) calls, so the host (nessy) could not tell a DMC
sample fetch from an ordinary CPU read and could not apply the DMA-specific
semantics. An earlier "minimal $4015 read" attempt failed for exactly this
reason — no DMA context on the read. Open-bus state itself is host-owned:
nessy already sees every external read/write and can latch the last bus value
without any CPU help.
Decision: Add an optional DmaReadBus Bus extension —
ReadDma(addr uint16, kind DmaKind) byte — with a DmaKind tag
(DmaDummyRead / DmaSpriteRead / DmaDmcRead). When c.Bus implements it,
ProcessPendingDma routes its reads through ReadDma; otherwise it falls
back to Bus.Read, byte-for-byte identical to today. The assertion is cached
at SetBus (a dmaBus field, mirroring the existing busTicker cache) so
the 256-read sprite loop pays no per-read type-assert. The CPU contributes
only the tag — no open-bus latch, mask, or prevReadValue moves into
cpu.CPU; the host owns all of it. The exact conflict formula
(GetOpenBusMask, controller bit-deletion) is reconstructed and validated in
nessy against a MesenCE (NesCpu::ProcessDmaRead) cycle-by-cycle reference —
chippy cannot validate it, having no NES peripherals.
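A minimal Go sketch of the seam described above, assuming a two-method Bus interface and a stripped-down CPU struct. DmaReadBus, ReadDma, DmaKind and its three values, SetBus, and the dmaBus field come from the decision text; the Bus method set, the dmaRead helper (standing in for the reads inside ProcessPendingDma), and the plainBus / taggedBus test doubles are hypothetical names used only for illustration.

```go
package main

import "fmt"

// Bus is an assumed minimal shape of the CPU's bus interface.
type Bus interface {
	Read(addr uint16) byte
	Write(addr uint16, v byte)
}

// DmaKind tags why a DMA read was issued.
type DmaKind uint8

const (
	DmaDummyRead DmaKind = iota
	DmaSpriteRead
	DmaDmcRead
)

// DmaReadBus is the optional extension a host bus may implement.
type DmaReadBus interface {
	ReadDma(addr uint16, kind DmaKind) byte
}

// CPU holds only the fields this sketch needs.
type CPU struct {
	Bus    Bus
	dmaBus DmaReadBus // nil when Bus does not implement DmaReadBus
}

// SetBus installs the bus and caches the type assertion once,
// so DMA loops never assert per read.
func (c *CPU) SetBus(b Bus) {
	c.Bus = b
	c.dmaBus, _ = b.(DmaReadBus) // failed assertion leaves dmaBus nil
}

// dmaRead is a hypothetical helper: tagged read when the host opted in,
// plain Bus.Read otherwise.
func (c *CPU) dmaRead(addr uint16, kind DmaKind) byte {
	if c.dmaBus != nil {
		return c.dmaBus.ReadDma(addr, kind)
	}
	return c.Bus.Read(addr)
}

// plainBus implements only Bus (the fallback case).
type plainBus struct{ mem [0x10000]byte }

func (b *plainBus) Read(addr uint16) byte     { return b.mem[addr] }
func (b *plainBus) Write(addr uint16, v byte) { b.mem[addr] = v }

// taggedBus also implements DmaReadBus and records each tag it sees.
type taggedBus struct {
	plainBus
	kinds []DmaKind
}

func (b *taggedBus) ReadDma(addr uint16, kind DmaKind) byte {
	b.kinds = append(b.kinds, kind)
	return b.Read(addr)
}

func main() {
	var c CPU

	plain := &plainBus{}
	plain.Write(0x4015, 0x40)
	c.SetBus(plain)
	fmt.Println(c.dmaRead(0x4015, DmaDmcRead), c.dmaBus == nil)

	tagged := &taggedBus{}
	tagged.Write(0x4015, 0x40)
	c.SetBus(tagged)
	fmt.Println(c.dmaRead(0x4015, DmaDmcRead), tagged.kinds)
}
```

In this sketch the CPU side carries nothing but the tag; any open-bus latch or conflict mask would live inside the host's ReadDma implementation, which is the split the decision calls for.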
Consequences: Additive, minor-bump-safe public API; zero hot-path and
zero behavior change for non-NES variants and for hosts that don't implement
the interface. chippy ships the seam plus unit tests (a fake DmaReadBus
asserting each read's tag, and a plain-Bus fallback proving identical
behavior); full dmc_dma_during_read4 convergence is a downstream nessy task
gated on the MesenCE reference. The seam follows chippy's established
host-hook pattern (DMCFetcher, PPURunner, the per-access hook) — the CPU
exposes a typed extension point and the consumer supplies the
platform-specific behavior.
D2 — idle() polls ProcessPendingDma so a DMA halt drains on dummy-read cycles (#493)
Context: With D1's seam wired, dma_2007_read still hung: its
self-calibrating poll loop only escapes when the single-byte DMC DMA steals
its cycle coincident with the $4015/$2007 read, and chippy's steal was
landing one CPU cycle late. A from-boot (PC, cycle) diff against a headless
MesenCE reference (built with make core + pgohelper, with a per-instruction
trace hook on NesCpu::Exec) showed the two emulators bit-identical for 62,741
instructions, then diverging at exactly one steal: chippy halted at the
taken branch's target ($E062, even cycle → 3-cycle steal) where MesenCE
halted on the branch's dummy-read cycle ($E078, odd → 4-cycle steal). A
3-cycle steal pins the loop period to the DMC sample period (8 × 428 = 3424),
so it never phase-drifts onto the read. The earlier "cumulative cycle-parity
offset" hypothesis was falsified by the bit-identical prefix.
Decision: idle() (the per-cycle dummy/internal-read path: addressing
dummies, the taken-branch dummy read branch() → c.idle(c.PC), etc.) now
polls ProcessPendingDma at the top when needHalt is set, exactly as
busRead already did. Mesen polls on every CPU cycle; chippy was missing the
poll on idle cycles, so a halt armed going into an idle drained only at the
next real read. Gated by c.nesCycle (VariantNES + bus ticker) and
needHalt.
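The gating can be illustrated with a toy model, assuming a CPU that counts cycles one at a time. The nesCycle and needHalt flags and the idle / busRead / ProcessPendingDma names come from the text; the single-argument ProcessPendingDma signature, the cycles counter, and the drainedAt log are assumptions made so the example runs on its own, and the real DMA loop is reduced to clearing the flag.

```go
package main

import "fmt"

// CPU is a toy model; only the gating logic mirrors the decision text.
type CPU struct {
	nesCycle  bool     // VariantNES + bus ticker, per the text
	needHalt  bool     // a DMA halt is armed
	cycles    uint64   // toy per-cycle counter (assumption)
	drainedAt []uint64 // cycle index at which each halt drained (illustration only)
}

// ProcessPendingDma stands in for the real DMA loop: here it only
// records where the halt drained and clears the flag.
func (c *CPU) ProcessPendingDma(addr uint16) {
	c.drainedAt = append(c.drainedAt, c.cycles)
	c.needHalt = false
}

// busRead polls for a pending halt at the top, then spends one cycle.
func (c *CPU) busRead(addr uint16) byte {
	if c.nesCycle && c.needHalt {
		c.ProcessPendingDma(addr)
	}
	c.cycles++
	return 0
}

// idle now performs the same poll, so a halt armed going into a
// dummy-read cycle drains there instead of at the next real read.
func (c *CPU) idle(addr uint16) {
	if c.nesCycle && c.needHalt {
		c.ProcessPendingDma(addr)
	}
	c.cycles++
}

func main() {
	c := CPU{nesCycle: true}
	c.busRead(0x8000) // cycle 0: an ordinary read
	c.needHalt = true // halt armed going into the next cycle
	c.idle(0xE078)    // cycle 1: dummy read, halt drains here
	c.busRead(0xE062) // cycle 2: nothing left to drain
	fmt.Println(c.drainedAt, c.needHalt)
}
```

Without the poll in idle, the same sequence would record the drain one cycle later, at the busRead that follows, which is the one-cycle-late steal described in the context.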
Consequences: Structurally limited to the NES DMA-halt path — non-NES
variants have nesCycle == false, so Klaus, Tom Harte (6502 + per-cycle bus
trace, wdc65c02), and cpu_interrupts are untouched (full suite green,
-count=1). Regression test TestIdle_DrainsPendingDmaHalt pins that a halt
armed into an idle cycle drains there. Downstream: dma_2007_read now reaches
the same $E72F terminal state as MesenCE.
D3 — getCycle parity uses the true cycle, not the stale instruction-boundary count (#493)
Context: ProcessPendingDma's getCycle decides whether a DMC read fires
this cycle or eats an alignment cycle first; it sets the 3-vs-4-cycle steal
length. It read c.Cycles & 1, but c.Cycles only advances at the
instruction boundary (exec.go folds instrCycles in after the opcode
completes); mid-instruction it is stale by instrCycles. Mesen's
_cycleCount ticks every cycle, so its getCycle is the true CPU parity.
Decision: getCycle := (c.Cycles + uint64(c.instrCycles)) & 1 == 0.
Opcode-fetch steals (instrCycles == 0) are unchanged; operand-read steals
(instrCycles > 0, the dma_2007_read case) now pick the correct alignment.
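The difference between the two parity computations can be checked with a small Go sketch. Cycles and instrCycles are the fields named in the text; the int type of instrCycles, the reduced CPU struct, and the staleGetCycle / trueGetCycle method names are assumptions for illustration, and the cycle values are arbitrary.

```go
package main

import "fmt"

// CPU carries only the two counters the parity check reads.
type CPU struct {
	Cycles      uint64 // advances only at the instruction boundary
	instrCycles int    // cycles spent so far inside the current instruction (type assumed)
}

// staleGetCycle is the old computation: parity of the boundary count.
func (c *CPU) staleGetCycle() bool {
	return c.Cycles&1 == 0
}

// trueGetCycle is the corrected computation: parity of the actual cycle.
func (c *CPU) trueGetCycle() bool {
	return (c.Cycles+uint64(c.instrCycles))&1 == 0
}

func main() {
	// Opcode-fetch steal: instrCycles == 0, both computations agree.
	c := CPU{Cycles: 100, instrCycles: 0}
	fmt.Println(c.staleGetCycle(), c.trueGetCycle())

	// Operand-read steal three cycles into the instruction:
	// the stale value reports an even cycle, the true cycle is odd.
	c.instrCycles = 3
	fmt.Println(c.staleGetCycle(), c.trueGetCycle())
}
```

Whenever instrCycles is odd the two results disagree, so the stale form picks the wrong alignment on exactly those operand-read steals.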
Consequences: Prerequisite for D2's convergence (a steal on an operand
read must size correctly once it lands on the right cycle).
TestProcessPendingDma_StealParityUsesInstrCycles pins steal length to true
parity. cpu_interrupts_v2 / apu_test unaffected (they steal at
instrCycles == 0).