ADR 0011 — v1.8.0: accuracy tail — 2A03 DMA-read open-bus seam + DMC-DMA steal timing

  • Status: Accepted (2026-06-26)
  • Release: v1.8.0 (shipped 2026-06-26)
  • Theme: The post-v1.7.0 accuracy tail. This ADR is opened by the first decision of the cycle and grows as the release lands; per-release decisions fold in here rather than spawning topic ADRs. The through-line is dmc_dma_during_read4 / nessy #20: the host-side seam (D1) plus the two CPU-side DMC-DMA steal-timing fixes (D2, D3) that, with nessy's host conflict formula, finally converge dma_2007_read.

D1 — Tagged DMA reads (DmaReadBus) for the 2A03 DMA-read open-bus glitch (#481)

  • Context: The DMC-DMA-during-internal-register-read glitch (dmc_dma_during_read4, nessy #20) only manifests when a DMC sample fetch lands on an internal register $4000-$401F. Those reads are invisible to external chips, so the 2A03 latches an open-bus / internal-register bus conflict instead of a real value, and $4016/$4017 fetches delete a controller bit. chippy's DMA loop (ProcessPendingDma, #376) issued bare, untagged Bus.Read(addr) calls, so the host (nessy) could not tell a DMC sample fetch from an ordinary CPU read and could not apply the DMA-specific semantics. An earlier "minimal $4015 read" attempt failed for exactly this reason — no DMA context on the read. Open-bus state itself is host-owned: nessy already sees every external read/write and can latch the last bus value without any CPU help.
  • Decision: Add an optional DmaReadBus Bus extension — ReadDma(addr uint16, kind DmaKind) byte — with a DmaKind tag (DmaDummyRead / DmaSpriteRead / DmaDmcRead). When c.Bus implements it, ProcessPendingDma routes its reads through ReadDma; otherwise it falls back to Bus.Read, byte-for-byte identical to today. The assertion is cached at SetBus (a dmaBus field, mirroring the existing busTicker cache) so the 256-read sprite loop pays no per-read type-assert. The CPU contributes only the tag — no open-bus latch, mask, or prevReadValue moves into cpu.CPU; the host owns all of it. The exact conflict formula (GetOpenBusMask, controller bit-deletion) is reconstructed and validated in nessy against a MesenCE (NesCpu::ProcessDmaRead) cycle-by-cycle reference — chippy cannot validate it, having no NES peripherals.
  • Consequences: Additive, minor-bump-safe public API; zero hot-path and zero behavior change for non-NES variants and for hosts that don't implement the interface. chippy ships the seam plus unit tests (a fake DmaReadBus asserting each read's tag, and a plain-Bus fallback proving identical behavior); full dmc_dma_during_read4 convergence is a downstream nessy task gated on the MesenCE reference. The seam follows chippy's established host-hook pattern (DMCFetcher, PPURunner, the per-access hook) — the CPU exposes a typed extension point and the consumer supplies the platform-specific behavior.
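
The seam in D1 can be sketched as follows. This is a minimal illustration, not chippy's actual source: DmaReadBus, ReadDma, DmaKind, the three tag names, SetBus, and the dmaBus field come from the decision above, while the constant values, the dmaRead helper, and the plainBus / taggedBus fakes are assumptions made for the example.

```go
package main

import "fmt"

// DmaKind tags why a DMA read is happening. The names come from the ADR;
// the numeric values are assumed for this sketch.
type DmaKind uint8

const (
	DmaDummyRead DmaKind = iota
	DmaSpriteRead
	DmaDmcRead
)

// Bus is the baseline read interface every host implements.
type Bus interface {
	Read(addr uint16) byte
}

// DmaReadBus is the optional extension a host may implement.
type DmaReadBus interface {
	ReadDma(addr uint16, kind DmaKind) byte
}

// CPU holds only the fields this sketch needs.
type CPU struct {
	Bus    Bus
	dmaBus DmaReadBus // nil when the host does not implement DmaReadBus
}

// SetBus caches the type assertion once, so the DMA loop never asserts per read.
func (c *CPU) SetBus(b Bus) {
	c.Bus = b
	c.dmaBus, _ = b.(DmaReadBus)
}

// dmaRead is a hypothetical helper: tagged read if available, plain read otherwise.
func (c *CPU) dmaRead(addr uint16, kind DmaKind) byte {
	if c.dmaBus != nil {
		return c.dmaBus.ReadDma(addr, kind)
	}
	return c.Bus.Read(addr)
}

// plainBus is a fake host that only implements Bus.
type plainBus struct{ mem [0x10000]byte }

func (b *plainBus) Read(addr uint16) byte { return b.mem[addr] }

// taggedBus is a fake host that also implements DmaReadBus and records each tag.
type taggedBus struct {
	plainBus
	kinds []DmaKind
}

func (b *taggedBus) ReadDma(addr uint16, kind DmaKind) byte {
	b.kinds = append(b.kinds, kind)
	return b.mem[addr]
}

func main() {
	var c CPU

	plain := &plainBus{}
	plain.mem[0x4015] = 0x40
	c.SetBus(plain)
	v := c.dmaRead(0x4015, DmaDmcRead) // falls back to Bus.Read
	fmt.Printf("plain host read:  %#02x\n", v)

	tagged := &taggedBus{}
	tagged.mem[0x4015] = 0x40
	c.SetBus(tagged)
	v = c.dmaRead(0x4015, DmaDmcRead) // routed through ReadDma
	fmt.Printf("tagged host read: %#02x, tags seen: %v\n", v, tagged.kinds)
}
```

The comma-ok assertion in SetBus leaves dmaBus nil for hosts that implement only Bus, which is what keeps the fallback path identical to an untagged read.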

D2 — idle() polls ProcessPendingDma so a DMA halt drains on dummy-read cycles (#493)

  • Context: With D1's seam wired, dma_2007_read still hung: its self-calibrating poll loop only escapes when the single-byte DMC DMA steals its cycle coincident with the $4015/$2007 read, and chippy's steal was landing one CPU cycle late. A from-boot (PC, cycle) diff against a headless MesenCE reference (built with make core + pgohelper, with a per-instruction trace hook on NesCpu::Exec) showed the two emulators bit-identical for 62,741 instructions, then diverging at exactly one steal: chippy halted at the taken branch's target ($E062, even cycle → 3-cycle steal) where MesenCE halted on the branch's dummy-read cycle ($E078, odd cycle → 4-cycle steal). A 3-cycle steal pins the loop period to the DMC sample period (8 × 428 = 3424), so the loop never phase-drifts onto the read. The earlier "cumulative cycle-parity offset" hypothesis was falsified by the bit-identical prefix.
  • Decision: idle() (the per-cycle dummy/internal-read path: addressing dummies, the taken-branch dummy read that branch() issues via c.idle(c.PC), etc.) now polls ProcessPendingDma at the top when needHalt is set, exactly as busRead already did. Mesen polls on every CPU cycle; chippy was missing the poll on idle cycles, so a halt armed going into an idle cycle drained only at the next real read. The poll is gated by c.nesCycle (VariantNES + bus ticker) and needHalt.
  • Consequences: Structurally limited to the NES DMA-halt path — non-NES variants have nesCycle == false, so Klaus, Tom Harte (6502 + per-cycle bus trace, wdc65c02), and cpu_interrupts are untouched (full suite green, -count=1). Regression test TestIdle_DrainsPendingDmaHalt pins that a halt armed into an idle cycle drains there. Downstream: dma_2007_read now reaches the same $E72F terminal state as MesenCE.
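
The one-cycle slip that D2 removes can be reproduced in a toy model. The sketch below is not chippy's code: idle, busRead, needHalt, nesCycle, and the ProcessPendingDma name come from the text above, while the pollOnIdle switch, the cycle counter, the drainedAt log, the pollDma and run helpers, and the body of ProcessPendingDma are hypothetical stand-ins.

```go
package main

import "fmt"

// CPU is a toy model; only needHalt and nesCycle are names taken from the ADR.
type CPU struct {
	nesCycle   bool     // true for the NES variant with a bus ticker
	needHalt   bool     // a DMA halt is armed
	pollOnIdle bool     // hypothetical switch: false models the pre-D2 behavior
	cycle      uint64   // hypothetical per-cycle counter
	drainedAt  []uint64 // cycles at which the stand-in DMA handler ran
}

// ProcessPendingDma is a stand-in: it records the current cycle and clears the halt.
func (c *CPU) ProcessPendingDma() {
	c.drainedAt = append(c.drainedAt, c.cycle)
	c.needHalt = false
}

// pollDma applies the gate described in the decision: NES cycle mode and an armed halt.
func (c *CPU) pollDma() {
	if c.nesCycle && c.needHalt {
		c.ProcessPendingDma()
	}
}

// busRead models a real read cycle, which polled for DMA before D2 as well.
func (c *CPU) busRead(addr uint16) byte {
	c.pollDma()
	c.cycle++
	return 0
}

// idle models a dummy/internal cycle; with pollOnIdle it polls at the top.
func (c *CPU) idle(addr uint16) {
	if c.pollOnIdle {
		c.pollDma()
	}
	c.cycle++
}

// run arms a halt, executes one idle cycle then one real read,
// and reports the cycle on which the halt drained.
func run(pollOnIdle bool) uint64 {
	c := &CPU{nesCycle: true, needHalt: true, pollOnIdle: pollOnIdle}
	c.idle(0x8000)
	c.busRead(0x8000)
	return c.drainedAt[0]
}

func main() {
	fmt.Println("drain cycle without idle poll:", run(false))
	fmt.Println("drain cycle with idle poll:   ", run(true))
}
```

In this model the halt drains one cycle later when idle does not poll, which is the kind of offset the trace diff exposed. With nesCycle false, neither path ever calls the handler.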

D3 — getCycle parity uses the true cycle, not the stale instruction-boundary count (#493)

  • Context: ProcessPendingDma's getCycle decides whether a DMC read fires this cycle or eats an alignment cycle first — it sets the 3-vs-4-cycle steal length. It read c.Cycles & 1, but c.Cycles only advances at the instruction boundary (exec.go folds instrCycles in after the opcode completes); mid-instruction it is stale by instrCycles. Mesen's _cycleCount ticks every cycle, so its getCycle is the true CPU parity.
  • Decision: getCycle := (c.Cycles + uint64(c.instrCycles)) & 1 == 0. Opcode-fetch steals (instrCycles == 0) are unchanged; operand-read steals (instrCycles > 0, the dma_2007_read case) now pick the correct alignment.
  • Consequences: Prerequisite for D2's convergence (a steal on an operand read must size correctly once it lands on the right cycle). TestProcessPendingDma_StealParityUsesInstrCycles pins steal length to true parity. cpu_interrupts_v2 / apu_test unaffected (they steal at instrCycles == 0).
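
The parity fix in D3 comes down to one expression, compared here against the stale version. In this sketch only Cycles, instrCycles, and the two expressions are taken from the text above; the field types, the staleGetCycle / trueGetCycle helper names, and the sample values are assumptions.

```go
package main

import "fmt"

// CPU carries just the two counters the parity check needs.
// The field types are assumed for this sketch.
type CPU struct {
	Cycles      uint64 // advances only at the instruction boundary
	instrCycles uint8  // cycles elapsed inside the current instruction
}

// staleGetCycle is the pre-D3 expression: parity of the boundary count only.
func (c *CPU) staleGetCycle() bool {
	return c.Cycles&1 == 0
}

// trueGetCycle is the D3 expression: parity of the real, mid-instruction cycle.
func (c *CPU) trueGetCycle() bool {
	return (c.Cycles+uint64(c.instrCycles))&1 == 0
}

func main() {
	for n := uint8(0); n < 4; n++ {
		c := &CPU{Cycles: 100, instrCycles: n}
		fmt.Printf("instrCycles=%d stale=%v true=%v\n",
			n, c.staleGetCycle(), c.trueGetCycle())
	}
}
```

The two expressions agree whenever instrCycles is even (including the opcode-fetch case of 0) and disagree whenever it is odd, so the old code picked the wrong alignment only for steals that landed an odd number of cycles into an instruction.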