chippy — Project Context Dump

Snapshot of the running understanding of this project. Generated 2026-05-11; last refreshed 2026-06-30 (post-v1.10.0). Treat this as a handoff document — anything not visible from git log or the code itself should live here.

0. Where we are right now (session handoff)

  • v1.10.0 shipped 2026-06-30 (tag v1.10.0, ADR 0013-v1.10.0.md, #505 + #507). Bank-aware 24-bit bus for the 65816; kills the bank-0 mirror. Minor bump: additive cpu.Banked24 + bank-aware inspection; 6502/65C02/2A03 paths untouched. cpu.Banked24 routes bank 0 through the 16-bit MMIO/WBus/RAM chain and backs banks 1-255 with a flat 16 MB store; wired at cmd/chippy/main.go (replacing Bus24From16) + the DAP launch path (which never set a 24-bit bus for 65816, a latent nil panic). Four layers: core Banked24; loader Intel HEX type-04 (Extended Linear Address) → banks >0 via Options.Bus24 (.bin/.prg/.o stay bank-0); DAP peekByte24/writeByte24 + memMax() clamp lift + AttachConfig.Banked/Server.SetBanked; TUI Source.ReadMemory(uint32), memory panel MemViewBank:MemViewAddr, :bank N selector, $BB:XXXX addresses, bank>0 edits, persisted MemViewBank. Watchpoints/peripherals stay bank-0 (no cross-bank MMIO). D2 cross-bank disassembly (#507) done: DisasmCPUAt/WalkBackAt (bus-explicit), DAP bankView over peekByte24 + 24-bit disassemble reference, TUI Source.Disassemble(uint32) anchored at PBR:PC, $BB:XXXX rows + (bank $NN) title; symbols/source/data ranges stay bank-0. Remaining follow-up: remote 65816 over DAP (RemoteSource = bank 0).
  • v1.6.0 shipped 2026-06-16 (tag v1.6.0, GitHub release + Homebrew cask live, ADR 0009-v1.6.0.md). Closed epic #438: ARR decimal (#424), 238/238 6502 bus-exact (#428), 65C02 Tom Harte + 5 CMOS fixes (#426), struct overlay watch (#409), DAP array children (#410), chippy-state dirtyRanges (#440), goreleaser cask (#413).
  • v1.9.0 shipped 2026-06-29 (tag v1.9.0, ADR 0012-v1.9.0.md). Accuracy tail: per-cycle 65816 bus trace (TestHarte65816BusTrace, #495) completed — chunks 2–4 + full pin-string validation. All 256 opcodes in both emulation and native are per-cycle bus-exact (addr + value + the 8 pin bits VDA/VPA/VPB/RWB/E/M/X/MLB — stricter than the 6502/65C02 traces) except four in harteBusSkip816 (WAI/STP None-address halt, MVN/MVP whole-block-move model). Minor bump — test-only + step816-path internal-cycle emission; state/count harness and 8-bit cores unaffected.
  • v1.8.0 shipped 2026-06-26 (tag v1.8.0, ADR 0011-v1.8.0.md). Accuracy tail closing dmc_dma_during_read4 / nessy #20: D1 the DmaReadBus tagged-DMA-read seam (#481/#492); D2 idle() polls ProcessPendingDma so a DMA halt drains on a taken-branch dummy-read cycle (#493/#497) — root-caused by a from-boot (PC, cycle) diff vs a headless MesenCE reference (bit-identical 62,741 instructions, then chippy halts at the branch target $E062/even/steal-3 where MesenCE halts on the branch dummy read $E078/odd/steal-4); D3 true-cycle getCycle ((Cycles+instrCycles)&1, #493/#494). Also 65816 per-cycle bus-trace chunk 1 (#495/#496). With nessy's host-side dmaBus conflict formula, dma_2007_read now reaches the same $E72F terminal as MesenCE. Minor bump — DmaReadBus is additive; D2/D3 are NES-DMA-path behavior fixes (non-NES variants gated out, Klaus/Harte/Lorenz green).
  • v1.7.0 shipped 2026-06-25 (tag v1.7.0, ADR 0010-v1.7.0.md). Closed epic #458 + all sub-issues: full Tom Harte-validated WDC 65C816 core (#456, 256 opcodes emulation + native; #462 native folded in), TUI-via-DAP render+control flip (#449–452, #461, #471), freeze beyond RAM (#463), DAP data breakpoints + setVariable (#453, #454), 65C02 per-cycle bus trace (#455, #475), hosted WASM playground drag-drop (#457). Minor bump — TUI flip is internal/-only so the public Go API stayed additive.
  • Theme A (TUI-via-DAP migration + default flip, the v2.0 arc): #449 stack→stackTrace, #450 flags→variables, #451 memory→readMemory+dirtyRanges, #452 disasm→disassemble, then #461 flip default to DAP-only + delete the dead direct-render path (depends on #449–#452).
  • Theme B (DAP parity): #453 data breakpoints (setDataBreakpoints/dataBreakpointInfo, expose TUI mem-watch), #454 setVariable on memory + Globals array children.
  • Theme C (accuracy): #455 TestHarte65C02BusTrace (CMOS per-cycle bus-exactness, mirror #428).
  • Theme D (headline): #456 65816 variant (16-bit, emu/native), #462 65816 native-mode completeness + full Tom Harte 65816 (depends on #456), #457 hosted WASM playground (Pages).
  • Theme E (freeze): #463 MMIO/cart-bus freeze beyond RAM (extends #422, RAM-scoped in v1.6).
  • Suggested opener: #449 (smallest, unblocks rest of Theme A). Theme A now ends inside v1.7 with the #461 default-flip + dead-path delete (no separate v2.0 milestone for it).
  • DMC-DMA accuracy (nessy #20 spillover): #480 ported Mesen's missing needDummyRead cycle into ProcessPendingDma (halt → dummy read → DMC read; the read & bytesRemaining decrement were landing one cycle early). Non-regressing (cpu_interrupts_v2 5/5 + apu_test 8/8 verified via nessy go.mod replace). #481 (epic) tracks the rest of the dmc_dma_during_read4 fix: the DMA-during-internal-register-read glitch. chippy side landed (v1.8 ADR 0011 D1): the DmaReadBus seam — an optional ReadDma(addr, DmaKind) Bus extension that tags the DMA loop's reads (DmaDmcRead/DmaSpriteRead/DmaDummyRead), cached at SetBus, fall back to plain Bus.Read when unimplemented. The CPU contributes only the tag; the open-bus latch + internal-register conflict formula + $4016/$4017 bit-deletion stay host-owned. Remaining (nessy): implement ReadDma with the conflict formula and converge against a MesenCE cycle-by-cycle reference.
  • 2 open issues outside the v1.7 set (#480 done-pending-merge, #481 epic). main clean at the v1.6.0 prep merge.
  • ADRs current through v1.10.0 (0001–0013). Pre-1.0 0.x tags fold into ADR 0001.
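
The bank routing described in the v1.10.0 entry above can be sketched roughly as follows. This is a minimal illustration, not chippy's cpu.Banked24: the names bankedBus, Read24, Write24 and ram64k are hypothetical, and the real Bus24 interface may have a different shape. It only shows the idea of sending bank 0 through the existing 16-bit bus while banks 1-255 get their own storage.

```go
package main

import "fmt"

// Bus16 mirrors the 16-bit cpu.Bus shape listed under Core types.
type Bus16 interface {
	Read(addr uint16) byte
	Write(addr uint16, v byte)
}

// ram64k stands in for the flat 64 KB cpu.RAM (hypothetical name).
type ram64k [1 << 16]byte

func (r *ram64k) Read(addr uint16) byte     { return r[addr] }
func (r *ram64k) Write(addr uint16, v byte) { r[addr] = v }

// bankedBus is a hypothetical model of the routing, not chippy's cpu.Banked24.
type bankedBus struct {
	bank0 Bus16  // existing 16-bit chain (MMIO, watchpoints, RAM)
	flat  []byte // flat 16 MB store indexed by 24-bit address; the bank-0 slice goes unused
}

func newBankedBus(bank0 Bus16) *bankedBus {
	return &bankedBus{bank0: bank0, flat: make([]byte, 1<<24)}
}

// Read24 sends bank 0 to the 16-bit bus and every other bank to the flat store.
func (b *bankedBus) Read24(addr uint32) byte {
	addr &= 0xFFFFFF
	if addr>>16 == 0 {
		return b.bank0.Read(uint16(addr))
	}
	return b.flat[addr]
}

// Write24 applies the same routing rule for stores.
func (b *bankedBus) Write24(addr uint32, v byte) {
	addr &= 0xFFFFFF
	if addr>>16 == 0 {
		b.bank0.Write(uint16(addr), v)
		return
	}
	b.flat[addr] = v
}

func main() {
	bus := newBankedBus(&ram64k{})
	bus.Write24(0x001234, 0xAA) // bank 0
	bus.Write24(0x011234, 0xBB) // bank 1, same 16-bit offset
	fmt.Printf("$00:1234=%02X $01:1234=%02X\n", bus.Read24(0x001234), bus.Read24(0x011234))
}
```

With a mirror, the second write would have clobbered the first; with per-bank storage the two addresses hold different bytes.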

1. Project Overview

chippy is a Go-based TUI 6502 emulator with a Bubble Tea + Lipgloss source-level debugger. It targets ca65/cc65 toolchain output (.bin, .prg, .hex, .o via ld65) and aims to feel like an interactive debugger (gdb/lldb/nvim-dap style) for hobbyist 6502 development.

  • Module: github.com/nkane/chippy
  • Repo: https://github.com/nkane/chippy (public, primary branch main)
  • License: MIT (LICENSE in repo root)
  • Latest release: v1.10.0 (2026-06-30; brew install --cask nkane/tap/chippy). See docs/adr/0013-v1.10.0.md.
  • Go version: 1.26.2 in go.mod; CI uses stable

Vision

A debugger-first emulator. Run a binary from ca65, see source lines beside disassembly, set breakpoints/watchpoints with nvim-DAP-style sigils, step backwards, inspect memory, and integrate real peripherals (MMIO).


2. Architecture

Package layout

Public packages (importable by external modules incl. the future nessy repo — promoted out of internal/ in #349):

cpu/                    # 6502 / 65C02 / 2A03 core, opcode tables, addressing, interrupts
loader/                 # .bin/.prg/.hex/.o loaders; invokes ld65 when needed
symbols/                # cc65 .dbg parser (symbol table + source map)
peripheral/             # MMIO peripherals (TextOutput @ $F001, KeyboardInput @ $F004/$F005)
expr/                   # watch / condition expression compiler (shared by TUI + DAP)
trace/                  # -trace output parser for replay mode
dap/                    # Debug Adapter Protocol server
Private (module-internal):
cmd/chippy/             # chippy binary entry point
cmd/nessy{,-wasm,-record}/ # nessy binaries (move to the nessy repo at #351)
internal/tui/           # Bubble Tea model, panels, breakpoints, watchpoints (chippy-only)
internal/nes/           # NES PPU / APU / cart / dma / joypad (nessy-only; moves at #351)
example/                # ca65 sample programs + Makefile
roms/demos/             # nessy homebrew demo ROMs
docs/                   # mascot prompts, this file
.github/workflows/      # CI + release

Core types

  • cpu.CPU — registers, flag helpers, opcode dispatch, interrupt latches
  • Fields: A,X,Y,SP,P byte; PC uint16; Cycles uint64; Bus Bus; Variant Variant; Halted bool; extraCycles int; opcodes *[256]Instr; irqLine bool; nmiPending bool; nmiPrev bool
  • cpu.Bus interface — Read(addr uint16) byte; Write(addr uint16, v byte)
  • cpu.RAM — flat 64KB backing store
  • cpu.Instr{ Mode AddrMode; Cycles int; PageAdd bool; Exec func(*CPU, uint16, AddrMode) }
  • cpu.Variant: VariantNMOS | VariantCMOS65C02; selects opcode table
  • tui.WBus — wraps cpu.Bus to capture memory access for watchpoints
  • tui.MemBP — memory breakpoint kinds (read / write / read+write)
  • symbols.Table / symbols.SourceMap — parsed cc65 .dbg data

Opcode tables

  • Opcodes [256]Instr — NMOS, authoritative (cpu/opcodes.go)
  • OpcodesCMOS [256]Instr — initialised from Opcodes then overridden (cpu/opcodes_cmos.go)
  • Illegals patched into NMOS table by opcodes_illegal.go (runs after CMOS init due to lex file order)
  • CPU dispatch goes through c.opcodes[op] so variant switching is free

Step semantics

  • Step() services interrupts at instruction boundary, THEN executes one opcode
  • NMI checked first (edge-triggered, always taken)
  • IRQ checked second (level-triggered, only when FlagI clear)
  • Servicing is 7 cycles, pushes PC+P (B clear), sets I, jumps to vector
  • Servicing un-halts the CPU
  • Returns total cycles including interrupt overhead + branch extras
  • c.Cycles is also advanced (same total)
  • c.extraCycles is the side channel for branches and CMOS BCD; reset each Step
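
The ordering and cycle accounting above can be sketched like this. It is a toy model under stated assumptions: the opcode is a fixed 2-cycle stand-in, service only models the bookkeeping (no stack pushes or vector fetch), and returning 0 cycles while halted with no interrupt pending is a guess, not documented behavior.

```go
package main

import "fmt"

const flagI = 0x04 // interrupt-disable bit in P

type cpu struct {
	P           byte
	Cycles      uint64
	Halted      bool
	irqLine     bool
	nmiPending  bool
	extraCycles int
}

// service models only the bookkeeping of interrupt entry: set I, un-halt, 7 cycles.
func (c *cpu) service() int {
	c.P |= flagI
	c.Halted = false
	return 7
}

// execOne is a stand-in for opcode dispatch: always a 2-cycle instruction.
func (c *cpu) execOne() int { return 2 }

// Step services at most one interrupt (NMI before IRQ), then runs one opcode.
func (c *cpu) Step() int {
	c.extraCycles = 0 // side channel is reset every Step
	total := 0
	switch {
	case c.nmiPending: // edge-triggered, always taken
		c.nmiPending = false
		total += c.service()
	case c.irqLine && c.P&flagI == 0: // level-triggered, masked by I
		total += c.service()
	}
	if !c.Halted { // assumption: a halted CPU with no interrupt does nothing
		total += c.execOne() + c.extraCycles
	}
	c.Cycles += uint64(total) // c.Cycles advances by the same total
	return total
}

func main() {
	c := &cpu{irqLine: true}
	fmt.Println(c.Step()) // IRQ serviced, then one opcode
	fmt.Println(c.Step()) // IRQ still asserted but I is now set
	c.Halted, c.nmiPending = true, true
	fmt.Println(c.Step()) // NMI wakes the halted CPU
	fmt.Println(c.Cycles)
}
```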

BCD differences

  • NMOS: A and C reflect decimal arithmetic; N/V/Z reflect the parallel binary path (a real 6502 quirk)
  • CMOS: N/V/Z reflect the decimal result; +1 cycle penalty
  • Implementation: ADC/SBC dispatch via c.Variant to adcDecimalCMOS / sbcDecimalCMOS
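
To make the NMOS quirk concrete, here is a small sketch of decimal-mode ADC using the commonly documented nibble-correction algorithm. It is not chippy's implementation: adcDecimalNMOS and adcResult are hypothetical names, only valid BCD operands are considered, and only Z is modelled on the binary path (N and V are left out to keep it short).

```go
package main

import "fmt"

// adcResult holds the parts of the outcome this sketch models.
type adcResult struct {
	A     byte // decimal-corrected accumulator
	Carry bool // decimal carry out
	Zero  bool // Z as the NMOS part sets it: from the binary sum
}

// adcDecimalNMOS adds two BCD bytes plus carry the way an NMOS 6502 does.
func adcDecimalNMOS(a, b byte, carryIn bool) adcResult {
	c := 0
	if carryIn {
		c = 1
	}
	binary := int(a) + int(b) + c // parallel binary path, used for Z

	lo := int(a&0x0F) + int(b&0x0F) + c
	if lo >= 0x0A { // low digit overflowed: correct it and carry into the high digit
		lo = ((lo + 0x06) & 0x0F) + 0x10
	}
	sum := int(a&0xF0) + int(b&0xF0) + lo
	if sum >= 0xA0 { // high digit overflowed
		sum += 0x60
	}
	return adcResult{A: byte(sum), Carry: sum >= 0x100, Zero: byte(binary) == 0}
}

func main() {
	r := adcDecimalNMOS(0x99, 0x01, false)
	fmt.Printf("A=%02X C=%v Z=%v\n", r.A, r.Carry, r.Zero)
}
```

For 99 + 01 the accumulator ends at $00 with carry set, yet Z stays clear because the binary sum $9A is non-zero. A CMOS part would report Z set for the same operands.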

Interrupts (PR #33, issue #10)

  • AssertIRQ() / ReleaseIRQ() — level-triggered, sets/clears irqLine
  • TriggerNMI() / DeassertNMI() — edge-triggered via nmiPrev rising-edge detect
  • Service routines push (P | FlagU) &^ FlagB (B clear), then set FlagI, read vector ($FFFA / $FFFE)
  • Wakes from Halted so a wait-loop can be interrupted by a peripheral
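
Two of the details above are easy to get wrong, so here is a small sketch of each: the status byte pushed on interrupt entry, and rising-edge detection for NMI. The names pushedStatus and nmiLine are hypothetical; the flag bit values are the standard 6502 layout.

```go
package main

import "fmt"

const (
	flagB = 0x10 // break flag (only meaningful in the pushed copy of P)
	flagU = 0x20 // unused bit, always pushed as 1
)

// pushedStatus returns the byte a hardware interrupt pushes: U set, B clear.
func pushedStatus(p byte) byte { return (p | flagU) &^ flagB }

// nmiLine latches a pending NMI only on a false-to-true transition.
type nmiLine struct {
	prev    bool
	pending bool
}

// set records the new line level and latches on a rising edge.
func (n *nmiLine) set(level bool) {
	if level && !n.prev {
		n.pending = true
	}
	n.prev = level
}

func main() {
	fmt.Printf("%02X\n", pushedStatus(0x10)) // B stripped, U added
	fmt.Printf("%02X\n", pushedStatus(0xFF))

	var n nmiLine
	n.set(true)
	fmt.Println(n.pending) // first assertion latches
	n.pending = false      // CPU services it
	n.set(true)
	fmt.Println(n.pending) // held high: no new edge, no new NMI
}
```

Holding the line asserted does not retrigger; it has to be released and asserted again, which is why the API pairs TriggerNMI with DeassertNMI.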

Memory routing (PR #34, issue #16)

Bus chain: CPU → tui.WBus → cpu.MMIO → cpu.RAM

  • cpu.Peripheral interface: Range() (lo, hi uint16); Read(uint16) byte; Write(uint16, byte)
  • cpu.MMIO wraps an inner Bus, dispatches to registered peripherals first
  • internal/peripheral.TextOutput: captures writes to $F001 into a buffer; rendered as a TUI panel
  • internal/peripheral.KeyboardInput: Apple-1-style data/status register pair ($F004/$F005); TUI pushes keypresses, CPU reads & status drains
  • Loader and reset-vector helpers write directly to ram, bypassing MMIO, so peripherals must live at addresses no ROM will occupy
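
A minimal sketch of the dispatch order in this chain, assuming an inclusive Range and a linear scan over registered peripherals. The Bus and Peripheral method sets are taken from the text; the MMIO fields, Register, and the textOutput type are hypothetical simplifications.

```go
package main

import "fmt"

type Bus interface {
	Read(addr uint16) byte
	Write(addr uint16, v byte)
}

type Peripheral interface {
	Range() (lo, hi uint16) // assumed inclusive on both ends
	Read(addr uint16) byte
	Write(addr uint16, v byte)
}

// RAM is a flat 64 KB backing store.
type RAM [1 << 16]byte

func (r *RAM) Read(addr uint16) byte     { return r[addr] }
func (r *RAM) Write(addr uint16, v byte) { r[addr] = v }

// MMIO checks registered peripherals first, then falls through to Inner.
type MMIO struct {
	Inner Bus
	devs  []Peripheral
}

func (m *MMIO) Register(p Peripheral) { m.devs = append(m.devs, p) }

func (m *MMIO) find(addr uint16) Peripheral {
	for _, d := range m.devs {
		if lo, hi := d.Range(); addr >= lo && addr <= hi {
			return d
		}
	}
	return nil
}

func (m *MMIO) Read(addr uint16) byte {
	if d := m.find(addr); d != nil {
		return d.Read(addr)
	}
	return m.Inner.Read(addr)
}

func (m *MMIO) Write(addr uint16, v byte) {
	if d := m.find(addr); d != nil {
		d.Write(addr, v)
		return
	}
	m.Inner.Write(addr, v)
}

// textOutput captures every byte written to $F001.
type textOutput struct{ buf []byte }

func (t *textOutput) Range() (uint16, uint16) { return 0xF001, 0xF001 }
func (t *textOutput) Read(uint16) byte        { return 0 }
func (t *textOutput) Write(_ uint16, v byte)  { t.buf = append(t.buf, v) }

func main() {
	ram, out := &RAM{}, &textOutput{}
	bus := &MMIO{Inner: ram}
	bus.Register(out)
	bus.Write(0xF001, 'H')
	bus.Write(0xF001, 'i')
	bus.Write(0x0200, 0x42)
	fmt.Printf("%q ram[$0200]=%02X ram[$F001]=%02X\n", out.buf, ram[0x0200], ram[0xF001])
}
```

The peripheral write never reaches RAM, which is the same property that makes a direct RAM write from the loader invisible to peripherals.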

Execution trace (PR #36, issue #21; #57 issue #43)

  • cpu.Tracer interface — optional per-instruction hook on CPU.Step(). Methods: LogStep, LogInterrupt.
  • cpu.FileTracer — buffered file sink (64 KiB), Enable/Disable/Close/SetPath
  • CLI: -trace PATH; TUI: :trace PATH | :trace on | :trace off | :trace
  • Instruction lines: PC, opcode bytes, disasm, A/X/Y/P/SP, cumulative CYC.
  • Interrupt-entry lines: ---- NMI -> $FFFA (PC=$XXXX P=PP SP=SS CYC:N) emitted at the service boundary, before the 7-cycle push/vector-load, so a reader sees where the PC jump in the next instruction originated.
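
For reference, an interrupt-entry line in the shape quoted above could be produced by something like the following. The function name interruptLine is hypothetical, and the uppercase hex digits and field widths are assumptions read off the template, not confirmed against cpu.FileTracer.

```go
package main

import "fmt"

// interruptLine renders one interrupt-entry trace line in the documented shape.
// kind is "NMI" or "IRQ"; vector is the address the CPU is about to load PC from.
func interruptLine(kind string, vector, pc uint16, p, sp byte, cyc uint64) string {
	return fmt.Sprintf("---- %s -> $%04X (PC=$%04X P=%02X SP=%02X CYC:%d)",
		kind, vector, pc, p, sp, cyc)
}

func main() {
	// State is captured before the 7-cycle push and vector load.
	fmt.Println(interruptLine("NMI", 0xFFFA, 0xC123, 0x24, 0xFD, 1234))
}
```

Because the line is emitted before the push, the PC it shows is the interrupted address, not the handler entry.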

3. Conventions & Workflow

Branch & PR flow

  • One issue → feat/<short-name> branch off main
  • gh pr create with a body containing Closes #N
  • CI must go green (3-OS test matrix + lint + klaus)
  • gh pr merge N --squash --delete-branch
  • File a new GitHub issue for any work that gets deferred

Commits

  • Conventional Commits: feat:, fix:, docs:, ci:, test:, refactor:, chore:
  • These prefixes feed .goreleaser.yml's changelog grouping

GitHub CLI

  • Authenticated as nkane over SSH (key: ~/.ssh/id_ed25519_github)
  • workflow scope confirmed (can edit .github/workflows/)

Releases

  • Cut a tag vX.Y.Z → .github/workflows/release.yml runs goreleaser
  • Binaries published to GitHub releases
  • Homebrew formula at nkane/homebrew-tap is auto-updated by goreleaser
  • Secret: HOMEBREW_TAP_GITHUB_TOKEN (PAT)
  • homebrew-core submission deferred until ~30 stars (issue #22)

Quality bars

  • go build ./... && go test ./... must stay green between increments
  • TUI must stay responsive — every Update key path returns tea.Cmd
  • Old persistence files (~/.chippy/state-<rom>.json) must keep loading

4. Progress

Shipped

  • v0.0.1 released; brew install works via tap
  • Release infra: .goreleaser.yml, .github/workflows/release.yml, MIT LICENSE, nkane/homebrew-tap

Closed issues

  • #1, #2, #3, #7, #8 (cycle audit), #9 (65C02), #10 (IRQ/NMI), #11–#15

Merged PRs of note

  • DMC-steal cycle parity fix (issue #493, v1.8, Refs not Closes): ProcessPendingDma's getCycle (the alignment-cycle decision that sets the 3-vs-4-cycle DMC steal length) read c.Cycles&1, but c.Cycles only advances at the instruction boundary (exec.go) — mid-instruction it is stale by instrCycles. Mesen's _cycleCount ticks every cycle, so its parity is the true CPU cycle. Fixed to (c.Cycles+instrCycles)&1. Opcode-fetch steals (instrCycles==0, e.g. cpu_interrupts_v2/apu_test) were already correct and unaffected; the bug only bit a steal landing on an operand read (instrCycles>0) — exactly the BIT $4015 case dma_2007_read needs, where stale parity froze the steal cost per iteration and killed the phase drift the calibration loop relies on. Regression test TestProcessPendingDma_StealParityUsesInstrCycles pins steal length to true-cycle parity. Still open in #493: full dma_2007_read convergence needs a MesenCE cycle-by-cycle reference + the nessy APU arm-cycle (DMC buffer-drain → SetNeedDmcDma); this fix is a prerequisite, not the whole story.
  • DMA-read open-bus seam — DmaReadBus (issue #481, v1.8, Refs not Closes): the chippy-side seam for the 2A03 DMA-during-internal-register-read glitch (dmc_dma_during_read4). ProcessPendingDma's five bus reads (cpu/dma.go) were untagged Bus.Read, so a host couldn't tell a DMC sample fetch from a normal read and couldn't apply the open-bus / internal-register conflict. New optional DmaReadBus{ ReadDma(addr, DmaKind) } Bus extension with a DmaKind tag (DmaDummyRead/DmaSpriteRead/DmaDmcRead); the DMC fetch routes through it as DmaDmcRead, sprites as DmaSpriteRead, halt/alignment dummies as DmaDummyRead. Assertion cached at SetBus (dmaBus field, mirroring busTicker) so the 256-read sprite loop pays no per-read type-assert; unimplemented hosts fall back to plain Bus.Read, byte-for-byte identical (proven by TestProcessPendingDma_PlainBusFallback comparing cycle count + routed sample). The CPU contributes only the tag — the open-bus latch + conflict formula + $4016/$4017 bit-deletion stay host-owned. Unit tests (cpu/dma_test.go) assert each read's tag + drain state. ADR docs/adr/0011-v1.8.0.md D1. DONE: the steal-timing follow-ups (#493/#494/#497, ADR D2/D3 — idle() halt poll + true-cycle getCycle) plus nessy's host dmaBus conflict formula converge dma_2007_read (reaches the same $E72F terminal as MesenCE), closing nessy #20.
  • 65816 full core — finish + close (issue #456, v1.7, Closes): the epic-closing PR — -cpu 65816 now runs in the TUI and disassembles 65816 mnemonics. Bus24From16 (cpu/cpu816.go) mirrors the 16-bit MMIO/watchpoint bus into bank 0, so the 65816 core executes bank-0 programs through the same RAM the panels render (reset vector resolves normally); cross-bank accesses originally aliased to bank 0 — fixed in v1.10.0 (#505): cpu.Banked24 gives banks 1-255 real storage (see §0 / ADR 0013). Disasm816 (cpu/disasm816.go) is a dedicated 256-entry decoder that reads the live M/X/E width to size immediates (LDA # is 2 or 3 bytes) and renders the new syntaxes (long $123456, [dp], sr,S, MVN src,dst); DisasmCPU/DisasmCPUWithSyms dispatch to it for VariantW65816. ADR docs/adr/0010-v1.7.0.md (Draft — the v1.7.0 release ADR, leading with the 65816 decisions D1–D7). README + -cpu help updated from "scaffold" to "full core." #456 done — 256/256 opcodes, Tom Harte-validated (254 via the harness in both modes, MVN/MVP via unit test).
  • 65816 full core — MVN/MVP (issue #456, v1.7, Refs not Closes): block move completes the 256-opcode execution core (blockMove in cpu/ctrl816.go, MVN $54 ascending / MVP $44 descending). Moves C+1 bytes from src bank:X to dst bank:Y using 16-bit X/Y/C regardless of width, sets DBR to the destination bank. chippy moves the whole block in one Step (7 cycles/byte) for debugger sanity rather than re-running the opcode per byte; the Tom Harte corpus caps each block-move test at ~100 cycles mid-instruction (a generator artifact), so MVN/MVP are covered by a dedicated unit test (cpu816_test.go) instead of the harness. All other 254 opcodes stay Harte-green in both modes. Remaining for #456: width-dependent disassembly + ADR, then close.
  • 65816 full core — chunk 3 (issue #456, v1.7, Refs not Closes): RMW + stack + control flow — 254 of 256 opcodes now implemented and Harte-green (only MVN/MVP $44/$54 remain). cpu/ctrl816.go adds the 24-bit-bus stack helpers, width-aware RMW kernels (ASL/LSR/ROL/ROR/INC/DEC accumulator + memory, TSB/TRB), branches (no page-cross penalty in native; +1 only when taken-and-crossing in emulation), and BRK/COP (mode-specific vectors $FFE6/$FFFE, $FFE4/$FFF4; push PBR in native, zero PBR in both, clear D + set I). Opcodes wired in step816: all shifts/rotates + INC/DEC/TSB/TRB, the full push/pull set (PHA/PLA/PHX/PHY/PLX/PLY/PHP/PLP/PHB/PLB/PHK/PHD/PLD/PEA/PEI/PER), JMP/JML/JSR/JSL/RTS/RTL/RTI + (abs)/(abs,X)/[abs] indirects, all branches incl. BRA/BRL, WDM/WAI/STP. Three quirks pinned against Harte: the new 16-bit stack instructions (PEA/PEI/PER/PHD/PLD/JSL/RTL) ignore the emulation page-1 wrap mid-instruction (full 16-bit SP, then SPHi reforced to $01) while the legacy ops (JSR/RTS/BRK/RTI/PHA…) page-wrap; JMP/JSR (abs,X) read the pointer high byte wrapping within the program bank (bankInc); and PEI's direct-page word read page-wraps in emulation (readDPWordWrap) where the (dp) addressing modes do not. TestHarte65816 green for all 254 implemented opcodes in BOTH modes (full files, uncached). Chunk 4 (final): MVN/MVP block move (Harte models them as a bounded fixed-cycle move) + width-dependent disassembly + ADR, then close the epic.
  • 65816 full core — chunk 2 (issue #456, v1.7, Refs not Closes): the width-aware memory engine + every new 65816 addressing mode. cpu/addr816.go resolves dp / dp,X/Y / abs / abs,X/Y / long / long,X / (dp) / (dp),Y / (dp,X) / [dp] / [dp],Y / sr,S / (sr,S),Y into 24-bit effective addresses, each returning the operand/pointer overhead cycles; readEA/writeEA transfer 1 or 2 bytes per the M/X width. Cycle accounting was reverse-engineered from the Harte corpus and is exact: +1 for 16-bit accumulator/index access (RMW pays it twice), +1 direct-page penalty when DL≠0, indexed-read +1 when the index is 16-bit or the add crosses a page, indexed write/RMW always pay the extra index cycle. cpu/arith816.go has the width-aware ALU (ORA/AND/EOR/CMP/BIT + 8-and-16-bit binary and decimal ADC/SBC — the 65816 takes no extra decimal cycle and keeps N/Z/V valid in BCD). Opcodes wired in step816: ORA/AND/EOR/ADC/SBC/STA/LDA/CMP (all modes), LDX/LDY/STX/STY/STZ, CPX/CPY, BIT. Two emulation-mode direct-page quirks were pinned against Harte: the DL=0 page-wrap applies only to the base offset (pointer bytes then increment flat 16-bit) — except the [dp],Y long pointer, whose three bytes do page-wrap (readDPLongWrap), while plain [dp] stays flat. TestHarte65816 green for all 122 chunk-2 opcodes in BOTH emulation and native, full 10000-case files (uncached). RMW (ASL/LSR/ROL/ROR/INC/DEC/TRB/TSB), stack ops, control flow (JMP/JSR/RTS/branches/BRK/COP), and MVN/MVP remain (still panic with their hex via the default arm).
  • 65816 full core — chunk 1 (issue #456, v1.7, Refs not Closes — epic stays open): the from-scratch 16-bit execution engine begins. step816 (cpu/exec816.go) is the 65816's own interpreter — Step() branches to it for VariantW65816 after interrupt/halt handling, fetching at PBR:PC through a new 24-bit bus (Bus24 interface, SetBus24, read24; the 65816's 16 MB space is independent of the 8-bit cores' 16-bit Bus). Register model (cpu/cpu816.go): mWide/xWide width predicates (M/X P-bits 5/4, native-only — locked set in emulation), A16/X16/Y16/SP16 + setA16/… accessors over the low+high byte fields, emulation forcing SP high byte $01. Chunk 1 covers the register/flag/transfer/immediate ops touching no data memory: NOP, flag set/clear (CLC…CLV), INX/DEX/INY/DEY, INC/DEC A, all transfers (TAX…TSC incl. TCD/TDC/TCS/TSC/XBA), immediate LDA/LDX/LDY/AND/ORA/EOR/CMP/CPX/CPY/BIT, and mode control XCE/SEP/REP — all width-aware. Harness (cpu/harte816_test.go, //go:build harte): pin dff67125, <op>.{e,n}.json split, 16-bit state + sparse 24-bit memory, validates final state + cycle count (TestHarte65816, green in BOTH emulation and native for the 41 chunk-1 opcodes; per-cycle pin trace is a later chunk). CI downloads just the 82 chunk-1 files. Unimplemented opcodes panic with their hex so the harness flags exactly what's left. Opcodes65816 table from phase 1 is now dead for execution (step816 is a direct switch) but still bound by bindTable. Chunk 2 (data-memory addressing: direct-page via D, absolute via DBR, long, [dp], stack-relative, MVN/MVP, full 256-op map, write24 for stores) and chunk 3 (native 16-bit completeness, #462) pending.
  • WASM playground drag-and-drop (issue #457, v1.7): the in-browser playground (web/ + cmd/chippy-wasm's JS API + the pages.yml Pages deploy, nesting web/ under /playground/) was already built + live at nkane.dev/chippy — demos dropdown, file picker, registers/disasm panes, keyboard→MMIO, text output. Closed #457's last done-when item: drag-and-drop a .bin/.prg/.hex anywhere on the page (web/chippy.js dragover/drop handlers → the existing loadUserFile path; body.dragging outline affordance in style.css), which the page copy already advertised. .dbg symbol drag-drop deferred — the wasm load() API doesn't expose symbol loading (a future enhancement). wasm builds; chippy.js valid; verification surface is the live Pages deploy.
  • BBR/BBS per-cycle bus trace (issue #475, v1.7): modeled the bit-test-and-branch 6-cycle bus pattern (branchBitTest in cpu/opcodes_cmos.go) so harteBusSkip65C02 is now empty — the 65C02 is per-cycle bus-exact for all 256 opcodes (completes #455). The sequence: zp-address operand read, zero-page bit-test read, a dummy write-back of that byte (65C02 RMW-style), relative operand read, then a dummy read of the branch target — which happens always, even when not taken, and uses the un-fixed (old-high-byte) address on a page cross; the carry-propagated target only reaches PC when the branch is actually taken (flat 6 cycles, no fixup cycle). Final state unchanged (dummy write is same-value), so TestHarte65C02 (state) still passes. Full uncapped TestHarte65C02BusTrace green; perfgate green.
  • 65816 variant, emulation-mode scaffold (issue #456 phase 1, v1.7): VariantW65816 + the register model (16-bit-capable B/XH/YH/SPHi high bytes, D direct-page, DBR/PBR banks, E emulation flag on the CPU struct), reset to emulation mode (E=1, stack page $01), bindTable → Opcodes65816, -cpu 65816 CLI. Opcodes65816 (cpu/opcodes_w65816.go, init sorts last so it copies the built OpcodesCMOS) = the 65C02 base for emulation-mode shared ops (low-byte ops leave the accumulator high byte intact, matching the 65816) + the mode-control opcodes XCE (swap C/E), SEP/REP (set/clear P bits, M/X locked while E=1). Hand-tested (cpu816_test.go): reset mode, base-op high-byte preservation, XCE toggling E↔C, SEP/REP, emulation M/X lock. NOT a complete 65816: the spike confirmed the full core is a dedicated epic (see Open issues / #456). The Tom Harte 65816 corpus is 512 files (256 ops × emulation/native) with 16-bit state + a 24-bit-address/pin-flag cycle format needing a bespoke harness; the $x7/$xF + CMOS-NOP slots are real 65816 instructions (long/[dp]/stack-rel/MVN/MVP/bank-transfer); native 16-bit M/X widths + new addressing modes are unimplemented. Full gate + perfgate green (hot path unchanged for other variants).
  • MMIO/cart-bus freeze (issue #463, v1.7): extended the debugger freeze/write-suppress facility (RAM.Freeze, #422, RAM-only) to the bus level — MMIO.Freeze/Unfreeze/Frozen/FrozenAddrs (cpu/peripheral.go). A CPU write to a peripheral never reaches RAM (MMIO intercepts it), so RAM-level freeze can't hold peripheral/cart values; the guard now lives in MMIO.Write (single len check, suppresses frozen addrs). Freeze writes the value through dispatchWrite once (lands in the peripheral or Inner) then adds to the set; works for peripheral- and RAM-mapped addresses. Zero hot-path cost when nothing frozen (perfgate green). RAM.Freeze stays for direct-RAM contexts. Consumed by hosts (nessy) needing peripheral/cart freeze. Test: freeze a peripheral addr + a RAM addr, confirm suppress + hold + unfreeze (peripheral_test.go).
  • Tom Harte 65C02 per-cycle bus trace (issue #455, v1.7): added TestHarte65C02BusTrace (cpu/harte_test.go) — the CMOS sibling of TestHarte6502BusTrace (#428), running the wdc65c02 set through the per-cycle busRecorder against VariantCMOS65C02. It surfaced that the per-cycle interleave was NMOS-modeled; ~64 opcodes diverged across the documented NMOS-vs-CMOS dummy-cycle classes, all now fixed bus-exact: (1) RMW dummy is a READ on 65C02, not a write-back (rmwDummy variant-gated; TRB/TSB/RMB/SMB now call it); (2) indexed page-cross dummy re-reads the last instruction byte c.PC-1 instead of the un-fixed address (indexedDummyAddr); (3) JMP (abs)/(abs,X) take 6 cycles — lo@ptr, dummy hi@the NMOS-wrap address, correct hi@ptr+1 (the extra cycle fixes the page bug); (4) push/pull (PHX/PHY/PLX/PLY) emit the PHA/PLA dummy cycles; (5) the WDC 3-byte NOPs ($5C/$DC/$FC) re-read the hi operand byte (opNOPAbs65C02) instead of dereferencing. Skip list (harteBusSkip65C02) is just BBR/BBS (16 ops — quirky 6-cycle dummy-write-back+branch-target pattern; state+count validated by TestHarte65C02); decimal ADC/SBC skipped per-case (per-cycle path doesn't emit the BCD-correction cycle's bus access). Runs in CI's "tom harte tests" (-run TestHarte65C02 matches by prefix). NMOS bus trace + both state suites + perfgate unaffected (all CMOS changes variant-gated).
  • DAP setVariable on memory + array children (issue #454, v1.7): extended handleSetVariable (dap/vars.go) beyond registers/flags. A Globals scalar (refGlobals + symbol name) resolves via globalAddr (syms.LookupName) and pokes the byte; an array child (a dynamic varRefs ref + [i] name, parsed by parseArrayIndex) pokes Addr+i. Both write through s.ram (bypass MMIO — a debugger poke, not a program access) and are refused while running. Completes the read/write story for the #410 Globals scope so an editor can edit a RAM cell or array element in the Variables pane. Tests: scalar + array-child write in vars_test.go.
  • DAP-only control: server-driven local run (issue #471, v1.7): the DAP server now owns run + step enforcement. New Server.RunBudget(maxSteps, step, stopAt) -> (stopped, reason, log) (dap/steps.go) advances the CPU on the TUI goroutine — step is the TUI's own m.step, so the rewind ring keeps filling — while enforcing breakpoints / data breakpoints / halt / BRK + an optional caller predicate. All run paths route through it: free-run (r), step-×16 (S), step-over (n, predicate=return-PC), run-to-line (f, predicate=line-change), via the new Source.RunBudget (LocalSource→its inproc server; RemoteSource no-op — wire uses async continue). Breakpoints + watchpoints forwarded at run start via Source.SetBreakpoints (LocalSource now real, not a no-op) + new Source.SetDataBreakpoints (setInstructionBreakpoints/setDataBreakpoints, #453); DataBreakpoint gained an additive logMessage for watchpoint logpoint parity. Removed the TUI's shouldBreakAt; processMemHits/WBus are now vestigial (ring-buffered, no leak; the access hook enforces watchpoints). Chose synchronous RunBudget over async continue+events: the inproc dispatch self-locks cpuMu, so an async run goroutine would deadlock with m.step's lock + the :dap co-running server (non-reentrant) — sync runs on the TUI goroutine (no goroutine race, TargetHz preserved). Zero per-access cost when no watchpoints set (data-bp hook armed only then). Rich TUI rewind kept as the local engine exception. Deferred (low value): single-step/mem-edit stay direct — routing the run's per-step through stepIn would regress run perf, and mem-edit via writeMemory bypasses MMIO (a behavior change); full WBus unwiring.
  • TUI render path fully DAP-sourced (issue #461, v1.7): cleared the last direct cpu/RAM reads from the render + navigation paths after Theme A. Stack panel raw byte/run rows read a DAP-sourced stack-page snapshot (StackSnapshot.Page, fetched via Source.ReadMemory($0100,256) in syncStack) through the new stackByte accessor; disasmScroll moves its anchor by stepping the DAP disasm snapshot (m.Disasm.Lines) instead of cpu.WalkBack/cpu.DisasmWithSyms. Deleted the unused walkBack wrapper + Model.isDataAddr. All five panels + nav are DAP-sourced — zero direct cpu.CPU/cpu.RAM in any render/nav path. The control path (run/step/bp/mem-edit/watchpoints) still drives the core directly (the debugger engine, not render); routing it through DAP — the "DAP-only" control flip — was split to #471 after the #461 spike surfaced its v2.0-scale tradeoffs (sync RunBudget vs async continue; single-owner cpuMu vs the non-reentrant m.step lock + :dap co-running server; a free-run-rewind conflict). Rich TUI rewind kept as the local engine exception.
  • DAP data breakpoints / memory watchpoints (issue #453, v1.7 Theme B): added dataBreakpointInfo + setDataBreakpoints + supportsDataBreakpoints capability (dap/databreakpoints.go). dataBreakpointInfo resolves a hex/decimal address or loaded symbol to a $XXXX dataId + read/write/readWrite access types; setDataBreakpoints replaces the watchpoint set (map[uint16]*dataBP, guarded by bpMu). Enforcement reuses the free-run access hook: installDirtyHook's chained AccessRead/AccessWrite hook (issue #421/#440) flags a matching watched access into dataBPPending; runLoopIter evaluates the bp's bpMeta (condition/hit/log reused from the instruction-bp path) and stops with reason data breakpoint after the instruction completes. Zero hot-path cost when none set (len guard); dataBPPending reset each run start. Transcript goldens regenerated. Prerequisite for the #461 watchpoint-over-DAP step (chosen sequence: #453 → full #461). Next: #461 (full DAP-only flip, incl. routing TUI MemBPs through setDataBreakpoints).
  • Disassembly panel via DAP disassemble (issue #452, v1.7 Theme A; completes Theme A's read panels): fifth and last panel migrated off direct core access, after Registers (#394), Stack (#449), Flags (#450), Memory (#451). New Source.Disassemble(anchor,above,below) + fetchDisasm + m.syncDisasm() + DisasmSnapshot (internal/tui/disasm.go, disasmCtx=48). disasmView renders text/symbol from the snapshot (applying its own PC/bp markers + styling); the old disasmAddrsAround/cachedDisasmAddrs/disasmCacheEntry render path is deleted. The DAP disassemble handler is now data-range-aware: it renders .byte $XX (step 1) when srcMap.IsData(addr), so it matches the TUI for any client. Both Sources own an inproc DAP server now: LocalSource on the live core, RemoteSource on its mirror (new mirrorServer/mirrorClient), so remote disasm follows the streamed PC with no per-tick wire round-trip (mirror current via chippy-state regs + #440 dirtyRanges; same NMOS-mirror variant behavior as before). Symbols/source-map push into both via a new symbolSink interface (generalizing #449's *LocalSource-only path). Residual for #461: disasmScroll's anchor nav still uses cpu.WalkBack/cpu.DisasmWithSyms (navigation, not render). All five panels now DAP-sourced; docs/dap-tui-migration.md marked complete; #461 flips the default + deletes the dead direct path. Next: #453/#454 (DAP parity) or #461 (flip).
  • Memory panel via DAP readMemory + dirtyRanges (issue #451, v1.7 Theme A): fourth panel migrated off direct cpu.CPU/RAM access, after Registers (#394), Stack (#449), Flags (#450). New Source.ReadMemory(addr,count) + fetchMem + m.syncMem() + a window snapshot m.MemView/MemViewBase (internal/tui/mem.go, memWindow=0x400). memView renders m.memByte(a) from the snapshot instead of m.RAM.Read. LocalSource.ReadMemory issues an inproc readMemory (on the same RAM, but over the protocol — closes the local direct-read gap); RemoteSource.ReadMemory serves the window from the DAP-fed RAM mirror (reconciled by RefreshMemory on stop, updated by #440 dirtyRanges during a run) so a remote free-run needs no per-frame round-trip — the chippy-state handler calls refreshMemWindow after applying the deltas. m.MemView refreshes alongside the other snapshots (seed/tick/post-key/stopped), skipped during a remote run. The memory editor's write path (memWrite → WBus/CPU bus → core) is untouched; remote writes are #454. Next: disassembly panel → disassemble (#452).
  • Flags panel via DAP Flags scope (issue #450, v1.7 Theme A): third panel migrated off direct cpu.CPU access onto DAP, after Registers (#394) and Stack (#449). New FlagsSnapshot (eight bools) + fetchFlags + m.syncFlags() (internal/tui/flags.go); Source.Flags() for LocalSource (inproc) and RemoteSource (wire), reading the existing Flags scope (variables ref=2 — server decomposes P into N/V/U/B/D/I/Z/C "0"/"1" bits). flagsView renders m.Flags instead of bit-testing m.CPU.P & cpu.FlagN. m.Flags refreshes at the same points as m.Regs/m.Stack (seed/tick/post-key/stopped), skipped during a remote free-run. During a remote run the chippy-state event (#395) already carries raw P, so the handler decomposes it client-side via flagsFromP — Flags panel stays as live as Registers without a per-frame round-trip; the Flags scope is the authoritative source on stop. No new DAP fields (Flags scope predates this). Next: memory panel → readMemory + dirtyRanges (#451).
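The client-side decomposition of raw P described in the Flags entry above can be sketched as follows. The bit positions are the standard 6502 status layout; the FlagsSnapshot field names and this flagsFromP body are assumptions for illustration, not the code in internal/tui/flags.go.

```go
package main

import "fmt"

// FlagsSnapshot holds the eight status bits as bools.
// Field names are an assumption; the notes only say "eight bools".
type FlagsSnapshot struct {
	N, V, U, B, D, I, Z, C bool
}

// flagsFromP splits a raw P byte into its bits (bit 7 = N ... bit 0 = C).
func flagsFromP(p byte) FlagsSnapshot {
	return FlagsSnapshot{
		N: p&0x80 != 0,
		V: p&0x40 != 0,
		U: p&0x20 != 0,
		B: p&0x10 != 0,
		D: p&0x08 != 0,
		I: p&0x04 != 0,
		Z: p&0x02 != 0,
		C: p&0x01 != 0,
	}
}

func main() {
	// $24 has bit 5 (U) and bit 2 (I) set.
	fmt.Printf("%+v\n", flagsFromP(0x24))
}
```

Because the decode is a pure function of one byte, the event handler can apply it on every chippy-state event without a round-trip.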
  • Stack panel via DAP stackTrace (issue #449, v1.7 Theme A): second panel migrated off direct cpu.CPU/RAM access onto DAP, following the #394 Registers template (docs/dap-tui-migration.md). New StackSnapshot/stackFrame + fetchStack + m.syncStack() (internal/tui/stack.go); Source.Stack() implemented for LocalSource (inproc server) and RemoteSource (wire). m.Stack refreshed at the same points as m.Regs (seed, per-tick, post-key, stopped event); skipped during a remote free-run (server owns the CPU; stopped reconciles). The panel keeps its hardware-stack-page layout (frame ranges + collapsed runs + T raw-toggle) rather than collapsing to a flat call list, so the DAP StackFrame gained two additive chippy-extension fields: chippyStackAddr ($01XX slot of each pushed return pair) and chippyCallee (symbol at the JSR target, distinct from Name = symbol at the return address). cpu.DetectStackFrame + symbol/source-map lookups now run server-side; stackEntries only positions the snapshot frames over the page and renders gaps as runs (raw bytes still from the DAP-fed RAM mirror; #451 will formalize). Local mode pushes syms+srcMap into its inproc server via new LocalSource.SetSymbols → dap.Server.SetSymbols (symbols load after New). The DetectStackFrame heuristic tests moved from internal/tui to cpu/stackframe_test.go (where the function lives). Next: flags panel → variables (#450).
  • chippy-state dirtyRanges — stream changed memory during free-run (issue #440, v1.6): the chippy-state event's reserved-empty dirtyRanges now carries the memory written since the previous emit, coalesced into [start,end) spans with current bytes inline (MemRange.Data, base64). The run loop arms an AccessWrite hook (cpu.SetAccessHook) that stamps a 64 KiB dirty bitmap bounded by dirtyLo/dirtyHi; sendChippyState flushes via flushDirtyRanges() under cpuMu and clears. New cpu.AccessHook() getter lets the server chain in front of a host's hook (#433) and restore it on stop — zero per-write cost when not running. TUI ChippyStateEvent handler applies each span to m.RAM (Load(Start, Data)) so memory/disasm panels update live during a remote run; the stopped full-RAM RefreshMemory stays as the authoritative final reconcile. start+len(Data) authoritative (span at $FFFF wraps End). Closes the #438 v1.6 epic.
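A minimal sketch of the flush step in the dirtyRanges entry above: walking a dirty bitmap between its low/high bounds and merging adjacent bytes into half-open spans. The span type and coalesce function are hypothetical; the real flushDirtyRanges also attaches base64 data and runs under cpuMu.

```go
package main

import "fmt"

// span is a half-open [Start, End) range of written addresses.
// int is used so a span ending at $FFFF yields End = 0x10000 without wrapping.
type span struct{ Start, End int }

// coalesce scans dirty[lo..hi] and merges runs of adjacent dirty bytes.
func coalesce(dirty []bool, lo, hi int) []span {
	var out []span
	for a := lo; a <= hi && a < len(dirty); a++ {
		if !dirty[a] {
			continue
		}
		start := a
		for a <= hi && a < len(dirty) && dirty[a] {
			a++
		}
		out = append(out, span{start, a})
	}
	return out
}

func main() {
	dirty := make([]bool, 0x10000)
	for _, a := range []int{0x0200, 0x0201, 0x0202, 0x0300, 0xFFFF} {
		dirty[a] = true
	}
	for _, s := range coalesce(dirty, 0x0200, 0xFFFF) {
		fmt.Printf("[$%04X,$%X)\n", s.Start, s.End)
	}
}
```

Bounding the scan by lo/hi is what keeps the flush cheap when a run only touches a small region.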
  • DAP array/struct children in variables (issue #410, v1.6): added a Globals scope (refGlobals=3, only advertised when a .dbg is loaded) to dap/vars.go. globalsVariables() enumerates data symbols via a new symbols.Table.Symbols() (sorted []Sym{Name,Addr,Size}); a symbol is "data" if cc65 sized it (size>0) or it falls in a SourceMap data range, and code labels (addrs with a PCToSrc entry) are filtered out. A size>1 symbol becomes an expandable array: a dynamic variablesReference (allocated from refDynamicBase=1000, rebuilt each Globals fetch into Server.varRefs map[int]arrayRef) whose children are indexed byte rows [0..size-1], paged via the new VariablesArguments.Start/Count. New Variable.IndexedVariables hint + supportsVariablePaging capability (transcript goldens regenerated with -update). Caps: maxGlobals=1024, maxArrayChildren=4096. Builds on the array-watch model from #408; the TUI struct overlay is the sibling #409.
  • Manual struct overlay watch (issue #409, v1.6): :watch X as {hp:byte, x:word, y:word} expands into named member rows read at X+offset. cc65 .dbg carries no struct member layout (V2.18 collapses all csym types to void — see #390), so the layout is user-declared. New Watch.Fields []WatchField{Name,Offset,Width} (additive, omitempty → stays state schema v1; golden state-v1.json + TestLoadState_GoldenV1 updated with a struct watch). Members are name:byte|word; offsets auto-advance by width, override with name@N:width (decimal/$hex) for padded/union layouts. Renderer writeStructWatch mirrors writeArrayWatch ({N} header + indented member rows); watchView/watchRowCount branch on len(Fields)>0 before the array path. Parsing helpers in prompt.go (parseStructSpec/parseStructField/parseWidthToken/parseOffset, capped at maxStructFields=32). Builds on the array-watch machinery from #408.
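The field grammar in the struct-overlay entry above (name:byte|word, offsets auto-advancing by width, name@N:width overrides in decimal or $hex) can be sketched like this. It is an illustrative parser, not the prompt.go helpers: it omits the maxStructFields cap, and the choice that auto-advance resumes after an overridden field is an assumption.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// WatchField is one named member of a user-declared struct overlay.
type WatchField struct {
	Name   string
	Offset int
	Width  int
}

// parseStructSpec parses "{a:byte, b:word, c@$10:byte}" into fields.
func parseStructSpec(spec string) ([]WatchField, error) {
	spec = strings.TrimSpace(spec)
	if len(spec) < 2 || spec[0] != '{' || spec[len(spec)-1] != '}' {
		return nil, fmt.Errorf("struct spec must be wrapped in braces: %q", spec)
	}
	var fields []WatchField
	next := 0 // offset a field receives when it has no @N override
	for _, part := range strings.Split(spec[1:len(spec)-1], ",") {
		name, kind, ok := strings.Cut(strings.TrimSpace(part), ":")
		if !ok {
			return nil, fmt.Errorf("field %q needs name:byte or name:word", part)
		}
		var width int
		switch strings.TrimSpace(kind) {
		case "byte":
			width = 1
		case "word":
			width = 2
		default:
			return nil, fmt.Errorf("unknown width %q", kind)
		}
		offset := next
		if n, at, pinned := strings.Cut(name, "@"); pinned {
			base := 10
			if strings.HasPrefix(at, "$") {
				at, base = at[1:], 16
			}
			v, err := strconv.ParseUint(strings.TrimSpace(at), base, 16)
			if err != nil {
				return nil, fmt.Errorf("bad offset in %q: %v", part, err)
			}
			name, offset = n, int(v)
		}
		name = strings.TrimSpace(name)
		if name == "" {
			return nil, fmt.Errorf("field %q has no name", part)
		}
		fields = append(fields, WatchField{Name: name, Offset: offset, Width: width})
		next = offset + width // assumption: auto-advance continues after an override
	}
	return fields, nil
}

func main() {
	fields, err := parseStructSpec("{hp:byte, x:word, y:word}")
	if err != nil {
		panic(err)
	}
	for _, f := range fields {
		fmt.Printf("%-3s +%d width %d\n", f.Name, f.Offset, f.Width)
	}
}
```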
  • Tom Harte 65C02 ProcessorTests (issue #426, v1.6): TestHarte65C02 runs the wdc65c02 set against VariantCMOS65C02, reusing the harte harness via a harteSuite descriptor (data subpath + variant + skip list + per-case filter). Shook out real CMOS accuracy bugs, all fixed: (1) ADC decimal V flag was computed from the post-+0x60 result — moved to the pre-correction partial sum (silicon order; N/Z stay from the final decimal result). (2) ASL/ROL/LSR/ROR abs,X were a flat 7 cycles (NMOS) — 65C02 optimized them to 6 (+1 only on page cross); INC/DEC abs,X correctly stay 7. (3) JMP (abs) indirect was 5 cycles — 65C02 takes 6. (4) $5C NOP was 8 cycles — real W65C02S is 4 (8 is the '816). (5) BBR/BBS carried a taken/page-cross branch penalty — silicon is a flat 6. Skips: WAI/STP (halts, not single-step state); invalid-BCD decimal ADC/SBC cases dropped per-case (effective operand resolved via the production resolve()) — chippy's valid-BCD decimal matches silicon on every case, only the documented-undefined invalid-BCD inputs diverge. CI "tom harte tests" job extended with a cached wdc65c02 download + -run TestHarte65C02. cmos_cycles_test.go ($5C, BBR/BBS) updated to the corrected counts.
  • Per-cycle bus-trace quirks → 238/238 6502 bus-exact (issue #428, v1.6): fixed the three nesCycle-path divergences TestHarte6502BusTrace skip-listed. (1) Taken page-crossing branches — the fix-up cycle now dummy-reads the pre-fixup address (oldPCH | newPCL) ((c.PC & 0xFF00) | (addr & 0xFF) in branch()), matching silicon, instead of the already-corrected target. (2) JSR $20 — added a dedicated JSRABS addressing mode: resolve fetches only the low operand byte and leaves PC at the high byte; opJSR does the internal stack cycle, pushes the return address, then reads the high byte. This puts the high-byte fetch last (silicon order) AND naturally reproduces the #427 stack/operand overlap quirk — so the NMOS-only re-read special-case AND the nesCycle branch both collapse into one unified path. (3) RTS $60 — the final PC-increment dummy now reads at the pulled PC, not pulled+1. harteBusSkip is now empty. Cycle counts/final state unchanged; klaus/lorenz/state-tests all still green.
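For item (1) in the entry above, the dummy-read address on the fix-up cycle is the old PC high byte combined with the new target low byte. A small sketch of that arithmetic, using hypothetical helper names rather than the actual branch() body:

```go
package main

import "fmt"

// branchFixupAddr is the address dummy-read while the PC high byte is still
// stale: old PCH, new PCL.
func branchFixupAddr(pc, target uint16) uint16 {
	return (pc & 0xFF00) | (target & 0x00FF)
}

// crossesPage reports whether the branch needs the fix-up cycle at all.
func crossesPage(pc, target uint16) bool {
	return pc&0xFF00 != target&0xFF00
}

func main() {
	pc := uint16(0x80F0)  // PC after the branch operand was fetched
	target := pc + 0x0020 // taken branch lands on the next page
	if crossesPage(pc, target) {
		fmt.Printf("target $%04X, fix-up dummy read at $%04X\n",
			target, branchFixupAddr(pc, target))
	}
}
```

Here the target is $8110 while the dummy read goes to $8010, which is the difference a per-cycle bus trace can see and a cycle-count test cannot.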
  • ARR ($6B) decimal-mode fix (issue #424, v1.6 first item): added the decimal path to opARR (cpu/opcodes_illegal.go) per the 64doc/no-more-secrets algorithm — N/Z/V from the binary rotate of A & imm; A + C take a per-nibble BCD fixup (low +6 when (t&0x0F)+(t&0x01)>5, high +6 & C set when (t&0xF0)+(t&0x10)>0x50). Subtle bug caught en route: the high-nibble sum overflows a byte ($F0+$10=256→0), so the comparison must use int. Now 10000/10000 Tom Harte 6b cases pass → removed 0x6B from harteSkip, and re-added the arrb probe to the Wolfgang Lorenz suite (65 dumps, all green). nestest/Klaus unaffected.
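The fix-up described in the ARR entry above can be sketched as a standalone function. This follows the conditions quoted in the entry and is not the opARR source; the function name and return shape are assumptions. The int conversions are the point: summing the nibble masks as bytes wraps $F0+$10 to 0 and the high fix-up would be skipped.

```go
package main

import "fmt"

// arrDecimal models ARR #imm with the decimal flag set.
// N/Z/V come from the binary rotate; A and C take the per-nibble BCD fix-up.
func arrDecimal(a, imm byte, carryIn bool) (result byte, carryOut, n, z, v bool) {
	t := a & imm
	rot := t >> 1
	if carryIn {
		rot |= 0x80
	}
	n = carryIn
	z = rot == 0
	v = (t^rot)&0x40 != 0
	result = rot
	// Low nibble: add 6, keeping the high nibble untouched.
	if int(t&0x0F)+int(t&0x01) > 5 {
		result = (result & 0xF0) | ((result + 6) & 0x0F)
	}
	// High nibble: compare as int so $F0+$10 = 256 does not wrap to 0.
	if int(t&0xF0)+int(t&0x10) > 0x50 {
		result += 0x60
		carryOut = true
	}
	return result, carryOut, n, z, v
}

func main() {
	res, c, n, z, v := arrDecimal(0xFF, 0xFF, false)
	fmt.Printf("A=$%02X C=%v N=%v Z=%v V=%v\n", res, c, n, z, v)
}
```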
  • Host hooks for NES breakpoints + step granularity (issue #433, closes epic #419): two extension points so nessy expresses NES-aware breakpoints + step granularity through the chippy DAP server. (1) expr.HostVarResolver: Compile(src, syms, host...) is variadic; identEval consults the host resolver (after CPU regs/flags, before symbols) so a host identifier like scanline resolves to a getter read at eval time (scanline == 30 works against PPU state the 6502 can't see). Server.SetHostVars threads it into condition compilation (bpmeta) + evaluate. (2) Server.SetStopPredicate(func() bool): checked once per runLoopIter post-step under cpuMu; when true the run stops with reason step, letting a host build run-to-NMI / step-scanline / step-frame on the server's pause/ownership model instead of a side-loop. Both nil by default + guarded by cpuMu. Pairs with #416's CustomRequestHandler (request transport), #421 (access heatmap), #422 (freeze). Tests: expr host-var resolution (live getter), evaluate over a host var, predicate stops a continue at N steps (filtering the attach entry stop). Docs: docs/dap.md host-hooks table + api.md. Epic #419 complete (4/4 hooks).
  • RAM address freeze / write-suppress (issue #422, epic #419 host-debug-hooks): RAM.Freeze(addr, value) sets a CPU-bus byte then suppresses all subsequent CPU writes to it so the value holds across frames (debugger freeze / cheats); Unfreeze/Frozen/FrozenAddrs round it out. RAM.Write guards at the top — if len(r.frozen) != 0 && r.frozen[addr] → return (suppressed before the shadow-capture + store). Zero-cost when nothing frozen: a single len check; perfgate green. The freeze set writes Data directly (no rewind epoch — it's a debugger action, not a program write). Scope is RAM (MMIO/cart freeze deferred). Consumed by nessy#32's memory-viewer freeze. Second non-DAP hook in epic #419 (after #421); next: #433 NES breakpoints.
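A minimal sketch of the write guard in the freeze entry above. The method names mirror the ones listed there, but the struct layout (a map keyed by address) is an assumption, and the shadow-capture step is left out.

```go
package main

import "fmt"

// RAM is a flat 64 KiB store with an optional set of frozen addresses.
type RAM struct {
	Data   [0x10000]byte
	frozen map[uint16]bool // assumed representation of the freeze set
}

// Write stores v unless addr is frozen. With nothing frozen the guard is a
// single length check.
func (r *RAM) Write(addr uint16, v byte) {
	if len(r.frozen) != 0 && r.frozen[addr] {
		return // suppressed: the frozen value holds
	}
	r.Data[addr] = v
}

// Freeze sets the byte directly, then suppresses later CPU writes to it.
func (r *RAM) Freeze(addr uint16, v byte) {
	if r.frozen == nil {
		r.frozen = make(map[uint16]bool)
	}
	r.Data[addr] = v
	r.frozen[addr] = true
}

// Unfreeze lets writes through again; the current value is left as is.
func (r *RAM) Unfreeze(addr uint16) { delete(r.frozen, addr) }

func main() {
	r := &RAM{}
	r.Freeze(0x0010, 0x63) // e.g. pin a lives counter
	r.Write(0x0010, 0x05)  // the program's write is dropped
	fmt.Printf("$%02X\n", r.Data[0x0010])
	r.Unfreeze(0x0010)
	r.Write(0x0010, 0x05)
	fmt.Printf("$%02X\n", r.Data[0x0010])
}
```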
  • CPU bus access-tracking hook (issue #421, epic #419 host-debug-hooks): opt-in CPU.SetAccessHook(func(addr uint16, kind AccessKind)) so a downstream emulator (nessy) can build a Mesen-style memory access heatmap without forking the core (cpu/access.go). AccessKind = AccessRead/AccessWrite/AccessExec. Refactored read into busRead(addr, kind); added fetch (opcode → AccessExec); read → AccessRead, write → AccessWrite, dummy idle → AccessRead. Step fetches the opcode via c.fetch so the opcode byte stamps exec, operands/data stamp read (matches the issue spec). chippy records nothing itself; the host's hook owns the recency state. Zero hot-path cost when unset: a single nil-check per access; perfgate green (BenchmarkStep_NMOS 9.7 ns/op vs 25 ns budget). First non-DAP hook in epic #419 (after #416's custom-request handler); next: #422 address freeze, #433 NES breakpoints. Consumed by nessy#32.
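The hook shape in the access-tracking entry above, sketched on a toy CPU. Only SetAccessHook, busRead, fetch and the AccessKind names come from the notes; the struct, the flat memory array and the counting host are illustrative assumptions.

```go
package main

import "fmt"

type AccessKind uint8

const (
	AccessRead AccessKind = iota
	AccessWrite
	AccessExec
)

// CPU is a toy core: flat memory plus an optional access hook.
type CPU struct {
	mem  [0x10000]byte
	hook func(addr uint16, kind AccessKind)
}

func (c *CPU) SetAccessHook(h func(addr uint16, kind AccessKind)) { c.hook = h }

// busRead is the single read path; the only cost when unset is the nil check.
func (c *CPU) busRead(addr uint16, kind AccessKind) byte {
	if c.hook != nil {
		c.hook(addr, kind)
	}
	return c.mem[addr]
}

func (c *CPU) fetch(addr uint16) byte { return c.busRead(addr, AccessExec) }
func (c *CPU) read(addr uint16) byte  { return c.busRead(addr, AccessRead) }

func (c *CPU) write(addr uint16, v byte) {
	if c.hook != nil {
		c.hook(addr, AccessWrite)
	}
	c.mem[addr] = v
}

func main() {
	c := &CPU{}
	counts := map[AccessKind]int{} // the host owns the bookkeeping
	c.SetAccessHook(func(_ uint16, k AccessKind) { counts[k]++ })

	// Bus pattern of a zero-page load followed by a store.
	c.fetch(0x8000) // opcode stamps exec
	c.read(0x8001)  // operand stamps read
	c.read(0x0010)  // data stamps read
	c.write(0x0011, 0x42)

	fmt.Println(counts[AccessExec], counts[AccessRead], counts[AccessWrite])
}
```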
  • DAP live-state streaming — chippy-state event (issue #395, v1.5.0 Theme A, closes epic #402): custom server→client event pushed during a free-run so panels refresh without per-frame variables polling. ChippyStateBody (regs A/X/Y/SP/P/PC/Cycles/Halted + a reserved dirtyRanges) emitted from runLoop throttled to ≤60 Hz (chippyStateInterval = time.Second/60); sendChippyState snapshots regs under cpuMu then sends. The TUI's dapEventMsg handler on dap.ChippyStateEvent remarshals the body into m.Regs (raw numeric values, not $XX strings — both ends are chippy). Crucially syncRegs now skips polling when remote+running (m.Running && m.Source.Attached()) so a remote run is purely event-driven; local mode (no server run loop) keeps polling the sub-µs inproc client. Additive per the Mesen/DAP-extension convention — standard clients ignore unknown events. Schema in docs/dap.md. Tests: stream fires with advancing Cycles, throttle caps near 60 Hz (not per-instruction).
  • Registers panel migrated to DAP (issue #394, v1.5.0 Theme A): PoC for the TUI-via-DAP-only direction — the Registers panel now renders from a DAP-sourced snapshot, never direct cpu.CPU field access. RegSnapshot + fetchRegs(dapRequester) (internal/tui/regs.go) do one variables round-trip (the server's Registers scope already returns A/X/Y/SP/PC/P/Cycles); remarshal makes it transport-agnostic (wire JSON body or inproc Go struct). New Source.Registers() (RegSnapshot, error): LocalSource owns an in-process DAP server (the #393 inproc transport, ~0.34 µs) attached to the same CPU/RAM, so local mode reads through DAP too; RemoteSource reuses its attach client. m.syncRegs() refreshes m.Regs in the Update loop (once per tick + after key actions + on seed in New/WithSource); regsView renders the cache so Bubble Tea View stays pure. Remote -dap-attach unchanged. Pattern documented in docs/dap-tui-migration.md for the v1.7 panel-by-panel migration (stack panel next via stackTrace). The in-process direct path is now dead code for this panel but kept until #461 (v1.7) flips the default.
  • In-process + unix-socket DAP transports (issue #393, v1.5.0 Theme A): the DAP server was already io.Reader/Writer-based, so unix is just another net.Conn (-dap unix:PATH via a shared acceptOne helper). The new piece is the zero-marshal inproc transport (dap/inproc.go): all server sends funnel through writeJSON, which now checks an optional sink func(any) — when set, Response/Event structs go straight to the in-process client instead of being JSON-marshalled to the wire. NewInprocServer() (*Server, *InprocClient); InprocClient.Request submits a Request straight to s.dispatch and reads the captured Response. Nil-args requests + all responses round-trip with zero serialization; typed args incur one marshal (handlers still parse Arguments as json.RawMessage). Bench (stepIn, NOP sled): inproc ~0.34 µs, unix ~30 µs — both inside the issue's <1 µs / <100 µs targets, ~90× apart. -dap inproc runs a loopback self-check (a standalone inproc server has no external client; the real consumer is the future embedded TUI, #394). newServer() factored out of NewServer for the shared map init. No panel migration yet (that's #394). Docs: docs/dap.md transport table + benchmarks.
  • Visual6502-equivalent per-cycle bus-trace validation (issue #400, v1.5.0 CPU-coverage, stretch): reframed from "diff 3-5 hand-curated Visual6502 traces" to comparing chippy's per-cycle bus activity against Tom Harte's cycles field (already downloaded for #401) — same goal (highest-fidelity per-cycle correctness probe), ~100× the coverage. TestHarte6502BusTrace in cpu/harte_test.go runs a busRecorder (logs every Bus.Read/Write as {addr,val,rw}) against the per-cycle interleave. Key trick: the nesCycle path routes every access (incl. dummy cycles) through c.Bus, but VariantNES disables decimal — so the harness keeps VariantNMOS (decimal intact) and force-enables c.nesCycle for the single-instruction step. Initially 228/238 opcodes matched bus-exact; the 10 divergent (8 page-crossing branches + JSR/RTS) were skip-listed and closed by #428 → now 238/238 bus-exact with an empty harteBusSkip. Runs in the existing "tom harte tests" CI job (-run TestHarte6502 prefix-matches both tests).
  • Tom Harte ProcessorTests, 6502 (issue #401, v1.5.0 CPU-coverage): per-opcode fuzz validation, ~10k randomized initial→final cases per opcode (regs + memory + cycle count). cpu/harte_test.go (build-tag harte, new "tom harte tests" CI job) runs all 256 opcodes via t.Run+t.Parallel; data is the pinned commit bb117564, not vendored (~1 GB): CHIPPY_HARTE_DIR points at a local 6502/v1, else per-opcode download→user-cache. CI caches the data via actions/cache keyed on the commit (parallel curl on miss). CHIPPY_HARTE_MAX_CASES caps cases for quick runs. Bus-trace comparison out of scope (cycle COUNT compared). Full run = 19/256 opcodes failing, resolved cleanly: 12 JAM/KIL ($x2; chippy NOP-stubs; skip-listed) + 6 unstable illegals (SHA/SHX/SHY/TAS 93/9b/9c/9e/9f, plus ARR-decimal 6b tracked in #424; magic-constant, skip-listed) + 1 real stable bug fixed: JSR $20 stack/operand-overlap quirk. The 6502 fetches the high operand byte after pushing the return address, so when the stack overlaps the operand the push is observed by that fetch (target $0155 not $1355). chippy's resolve(ABS) read both bytes up front; the fix had opJSR re-read the high byte at PC-1 post-push on the non-nesCycle path (cycle count table-driven there, so unchanged). (Superseded by #428: JSR now uses a dedicated JSRABS mode that defers the high-byte fetch in both paths, so the special-case is gone.) nestest/Klaus/Lorenz unaffected. Every stable opcode now passes. 65C02 set deferred to #426 (CMOS variant + own data).
  • Wolfgang Lorenz C64 suite — CPU subset (issue #399, v1.5.0 CPU-coverage): vendored 64 pure-CPU probes from Lorenz testsuite-2.15 (decimal adca/sbca/…, stable illegals asoa/rlaa/lsea/rraa/…, core opcodes, flags, branches, stack, jumps, BRK/RTI/RTS) under cpu/testdata/lorenz/ (extension-less .prg dumps, ~256 KB). cpu/lorenz_test.go (build-tag lorenz, new "wolfgang lorenz suite" CI job) runs each standalone via a minimal KERNAL-trap harness ported from floooh/chips-test m6502-wltest.c: load the dump at its 2-byte header addr, seed RAM ($0002/$A002/$A003/$FFFE-F/$01FE-F) + a 19-byte IRQ/BRK shim at $FF48, enter at $0801, trap $FFD2 CHROUT (print PETSCII + RTS), $E16F LOAD (chained → passed), $FFE4 GETIN (error keyscan → failed), $8000/$A474 (done). Standalone-per-test so a failure names the exact probe + captures its output. C64-hardware tests (CIA/SID/VIC, NMI/IRQ sourcing, banking) excluded — Klaus #404 covers IRQ/NMI/BRK. One real bug surfaced: arrb (ARR $6B in decimal mode) — chippy's binary-only ARR gives A=$2A vs $80; omitted + tracked in #424. All 64 included pass against VariantNMOS.
  • Klaus interrupt test — ca65 port (issue #404, v1.5.0 CPU-coverage): wired Klaus Dormann's 6502_interrupt_test (IRQ/NMI/BRK). Upstream ships as65 source only — no prebuilt bin, and ca65 can't parse as65 syntax — so it's ported to ca65 (cpu/testdata/6502_interrupt_test.ca65 + interrupt_test.cfg, mirroring amb5l's conversion conventions: .macro/.endmacro, .if/.endif, !=<>, org→segment+.org, .feature labels_without_colons). CI assembles it with cc65 and points CHIPPY_INTERRUPT_BIN at the result (the bin is a build artifact, gitignored; the ca65 source is the provenance). The test (cpu/interrupt_rom_test.go, build-tag klaus) drives the feedback port the test needs: $BFFC, config I_drive=1 open-collector / I_ddr=0 / IRQ_bit=0 / NMI_bit=1 → the I_set macro sets a bit to assert, so the lines are active-high (IRQ level on bit 0, NMI edge on the 0→1 of bit 1); the harness polls the port after each instruction and asserts via AssertIRQ/ReleaseIRQ/TriggerNMI. The ROM runs twice and ends in JMP * at a success site ($06F5/$070F/$072C); any other self-loop is a failing trap. Passes in 1050 instructions against VariantNMOS. Gotchas hit during the port: the linker cfg needs offset=$0200/$0400 (without them ld65 packed CODE after ZEROPAGE while .org set labels to $0400 → PC ran into $FF fill); polarity is active-high not active-low (the no-NMI macro variant I first read was the opposite). Closes the Klaus coverage gap from #397.
  • AllSuiteA CPU smoke ROM (issue #398, v1.5.0 CPU-coverage): wired Frank Kingswood's compact ~1.5 KB AllSuiteA.bin into cpu/allsuite_test.go (build-tag klaus, reuses httpDownload/verifySHA256/sha256SumOf from klaus_test.go). SHA-pinned from the pmonta/FPGA-netlist-tools mirror, downloaded-on-demand + cached (not vendored), CHIPPY_ALLSUITE_BIN override. Loads + enters at $4000; both pass and fail end in JMP * at $45C0, so the run loop detects the self-loop then reads the result byte — $0210 == $FF = pass (anything else is the failing test number). Runs against VariantNMOS, converges in 613 instructions. CI's klaus job now runs -run 'TestKlaus|TestAllSuiteA' (renamed "klaus + allsuite cpu tests"). First of the v1.5.0 CPU-corpus issues (#398-#401, #404 under epic #402); complements Klaus by catching gross opcode/addressing regressions in <1 ms.
  • Remove VS Code extension (post-v1.3.0): deleted extension/vscode-chippy/ and its pipeline — the vscode-extension publish job in release.yml, the vscode-ext CI job, and the npm Dependabot entry. Microsoft blocked the marketplace listing, so the extension is pulled until that's resolved (git history preserves it for revival). DAP editor integration is unaffected: VS Code / Cursor still drive chippy via a chippy -dap stdio launch config (examples/dap/launch.json), as documented in docs/editors.md. The trigger was the v1.3.0 release's vscode-publish failure — dependabot #406 bumped @types/vscode above engines.vscode, which vsce rejects (fixed in #415 before removal). The v1.3.0 binary release itself shipped fine (34 signed assets).
  • Deep rewind via keyframes (issue #392, v1.3.0): the per-step SnapshotRing only reaches back its capacity (256 steps), which is fine for "step back one" but useless for "rewind into the last few million steps". Added keyframe-based deep rewind: cpu.KeyframeRing holds periodic full-RAM snapshots (CPU.SnapshotFull captures all 256 pages; one keyframe every keyframeInterval=4096 steps), and :rewind N reconstructs any earlier step by restoring the nearest keyframe ≤ target (KeyframeRing.Nearest) and replaying forward to the exact step (rewindToStep → stepReplay loop under a replayingRewind guard so replay doesn't re-capture keyframes). Small jumps still pop the fine ring exactly. :rewind-budget MB resizes the ring (cap = budget/64KiB); reach = cap × interval, shown in the status bar as deep:<reach>@<budget>. Note: the issue's own numbers are mutually inconsistent, since full 64 KiB keyframes every 1k steps can't reach 10M under 256 MiB (that's ~4M). Used interval 4096 instead so 256 MiB reaches ~16.7M while forward-replay stays ≤4096 instructions (benchmarked 1.3 ms incl. replay, vs the 100 ms acceptance). Memory is a cap, not a reservation: the ring only fills to the run length; the old "fixed 256-entry ring" already sat at ≤16 MiB so the issue's "ring grows" framing was off. A step-0 keyframe is seeded on the first step so sub-interval targets are reachable. StepCount tracks position; < and reset keep it in sync. Determinism caveat: forward replay assumes deterministic execution between keyframes (buffered keyboard input is snapshotted, so it replays). Deltas-from-previous-keyframe compression is a future optimisation. No state-format change (StepCount/keyframes are ephemeral). cpu ring logic unit-tested apart from the TUI; deep-rewind exactness verified byte-for-byte against a RAM-mutating loop ROM.
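The budget arithmetic in the deep-rewind entry above is easy to check by hand. The sketch below reproduces cap = budget/64KiB and reach = cap × interval, plus the keyframe-then-replay split for a target step. Function names are hypothetical, and plan assumes keyframes sit at exact multiples of the interval starting from step 0.

```go
package main

import "fmt"

const (
	keyframeBytes    = 64 << 10 // one full-RAM keyframe: 64 KiB
	keyframeInterval = 4096     // steps between keyframes
)

// ringCap is how many keyframes fit in the budget (cap = budget / 64 KiB).
func ringCap(budgetMiB int) int { return (budgetMiB << 20) / keyframeBytes }

// reach is how many steps back the ring covers (cap × interval).
func reach(budgetMiB, interval int) int { return ringCap(budgetMiB) * interval }

// plan returns the keyframe step to restore and how many steps to replay.
func plan(target, interval int) (keyframe, replay int) {
	replay = target % interval
	return target - replay, replay
}

func main() {
	fmt.Println("cap @256 MiB:     ", ringCap(256))
	fmt.Println("reach @4096 steps:", reach(256, keyframeInterval))
	fmt.Println("reach @1000 steps:", reach(256, 1000))
	kf, n := plan(10_000_000, keyframeInterval)
	fmt.Printf("rewind to 10000000: restore step %d, replay %d\n", kf, n)
}
```

With a 256 MiB budget this gives 4096 keyframes, a reach of 16,777,216 steps at interval 4096 and 4,096,000 at interval 1000, matching the ~16.7M and ~4M figures in the entry.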
  • Trace replay: search / jump-to-cycle / diff (issue #391, v1.3.0). Four navigation features on top of -trace-replay (issue #64's playback). (1) :find EXPR / :rfind EXPR: jump to the next/previous frame matching an expression over the frame's registers/flags, reusing the breakpoint-condition expr grammar against a scratch CPU loaded per frame (framePredicate). A bare = is normalised to == (normalizeFindExpr) so :find PC=$8042 works as users type it; bare :find repeats the last expression to sweep matches. (2) :cycle N: Replay.SeekCycle binary-searches the monotonic cycle column (O(log N) on a 1M-frame trace). (3) -diff PATH: loads a second trace; trace.Diff walks both by index and returns the first Frame.Equal mismatch (or a length-mismatch divergence at the shorter trace's end) as trace.Divergence{Index,Cycle,Found}, computed eagerly in WithReplayDiff and surfaced in the status line. (4) d / D: d toggles a side-by-side diff overlay (diffModal, double-bordered like the help modal) centred on the primary cursor with mismatched frames in red + a gutter at the divergence; D jumps both cursors there. Pure-trace logic (SeekCycle/FindFunc/Diff/Frame.Equal) is unit-tested separately from the TUI wiring. No state-format change.
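The cycle seek in the trace-replay entry above relies on the cycle column being monotonic, which makes it a direct fit for sort.Search. A sketch under one assumption the notes do not state: that seeking lands on the first frame at or after the requested cycle and clamps to the last frame when the target is past the end.

```go
package main

import (
	"fmt"
	"sort"
)

// seekCycle returns the index of the first frame whose cycle is >= target,
// clamped to the last frame. It returns -1 for an empty trace.
func seekCycle(cycles []uint64, target uint64) int {
	i := sort.Search(len(cycles), func(i int) bool { return cycles[i] >= target })
	if i == len(cycles) {
		return len(cycles) - 1
	}
	return i
}

func main() {
	cycles := []uint64{0, 2, 5, 9, 12} // cumulative cycle count per frame
	for _, target := range []uint64{5, 6, 100} {
		fmt.Printf("cycle %d -> frame %d\n", target, seekCycle(cycles, target))
	}
}
```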
  • Watch panel array expansion (issue #390, v1.3.0): :watch learns an xN (or [N]) array token — :watch grid word x16 pins 16 consecutive LE words and renders them as indexed rows grid[0..15] (header [16], first maxWatchElemRows=8 shown, rest collapsed to … +N more). Element width = the watch's byte/word kind; addresses are Addr + i*Width. symbols.Table now parses the cc65 sym size= field (Size(addr)) and seeds the count automatically when present — but the issue's premise was false: cc65 V2.18 .dbg carries no struct member layout, array bounds, or element types. C globals get bare sym ... type=lab records with no size=; even local csym records collapse every type to type id=0 val="00" (void). So struct-tree expansion is impossible from .dbg and the auto-seed rarely fires for data globals — xN is the workhorse. Scoped to array-only best-effort per that finding; struct overlays + DAP variables array children deferred (DAP has no globals scope yet). New Watch.Count is an optional v1 state field (omitempty, no schema bump). Tests: symbols size parse, :watch xN/[N] parsing + element addressing, panel render + truncation.
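Element addressing for an array watch, as described in the entry above, is Addr + i*Width with little-endian reads for words. A small sketch with hypothetical helper names and a flat 64 KiB byte slice standing in for RAM:

```go
package main

import "fmt"

// elemAddrs lists the address of each element: base + i*width.
func elemAddrs(base uint16, width, count int) []uint16 {
	addrs := make([]uint16, count)
	for i := range addrs {
		addrs[i] = base + uint16(i*width)
	}
	return addrs
}

// readElem reads a byte (width 1) or a little-endian word (width 2).
// mem must be 64 KiB so any uint16 index is in range.
func readElem(mem []byte, addr uint16, width int) uint16 {
	v := uint16(mem[addr])
	if width == 2 {
		v |= uint16(mem[addr+1]) << 8 // addr+1 wraps within the 16-bit space
	}
	return v
}

func main() {
	mem := make([]byte, 0x10000)
	mem[0x0302], mem[0x0303] = 0x34, 0x12 // grid[1] = $1234

	// Equivalent of ":watch grid word x4" with grid at $0300.
	for i, a := range elemAddrs(0x0300, 2, 4) {
		fmt.Printf("grid[%d] @ $%04X = $%04X\n", i, a, readElem(mem, a, 2))
	}
}
```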
  • Blargg apu_test 4/8 → 8/8 PASS — Mesen2 frame-counter substeps + DMC alignment (PRs #379-#382, nessy v0.10): wired Blargg's apu_test.nes (8 sub-tests: len_ctr, len_table, irq_flag, irq_timing, len_timing, irq_flag_timing, dmc_basics, dmc_rates) into the accuracy harness (#379) and closed every gap it surfaced over three follow-up PRs. (1) 6 internal frame-counter sub-steps (#380) — Mesen2 ApuFrameCounter.h:19 table encodes the user-visible 'step 3' of 4-step mode as 3 CPU cycles (29828, 29829, 29830) where IRQ asserts continuously and the half-frame tick fires at cycle 29829. chippy's 4-entry interval table from #377 fired the tick at 29828; replaced with frameStepIntervalsNtsc4Step = [6]int{7456, 7458, 7457, 1, 1, 7457} + 5-step analogue, switch in advanceFrameStep extended to 6 cases (step 3 = IRQ-only, step 4 = q+h+IRQ, step 5 = idle/reset for 4-step). Cleared 5-len_timing. (2) DMC buffer-fill + enable-fetch + $4015 read (#381) — three real-silicon DMC behaviors chippy was getting wrong: maybeRefill was silencing whenever bufferEmpty=true at the 8-bit boundary instead of only when bytesRemaining=0 too; setEnabled didn't schedule the initial DMA fetch (Mesen SetEnabled does via transferStartDelay); $4015 read was clearing the DMC IRQ flag (per nesdev + Mesen NesApu.cpp:101, only frame-counter IRQ is cleared by $4015 read — DMC IRQ acks via $4015 write or $4010 bit-7 clear). dmcChannel now inits with bufferEmpty=true+silenced=true. Cleared 7-dmc_basics' 18 sub-tests. (3) Mesen-aligned DMC Clock (#382) — three compounding structural mismatches: chippy burned an extra 'reload-only' fire per byte (each byte = 9 fires instead of Mesen's 8), the timer reload was period+1 cycles between fires (429 vs Mesen's 428), and the fetch-schedule check only ran at byte boundaries. 
Replaced clockShift+maybeRefill with a unified clock() mirroring Mesen DeltaModulationChannel::Run's inner body: always shift+decrement, reload at bitsRemaining=0 boundary, schedule fetch on every clock when buffer-empty+bytes-pending. Initialise bitsRemaining=8 (matches Mesen Reset:36). Cleared 8-dmc_rates' 16 rates × 2 boundary checks. All four accuracy ROMs now PASS: ppu_vbl_nmi 10/10, instr_timing, cpu_interrupts_v2 5/5, apu_test 8/8. No regression on nestest / Klaus / demo SHAs. The DMC restructure also fixes any ROM that uses delta samples — the rate timing was off by ~12% before. Refs #318 (rolling accuracy tracker).
  • Mesen2 ProcessPendingDma port + branch IRQ-poll + NTSC frame counter (issue #376, PR #377, nessy v0.10): cpu_interrupts_v2.nes now passes 5/5 alongside ppu_vbl_nmi 10/10 + instr_timing. Three layers: (1) Ported Mesen NesCpu::ProcessPendingDma (~120 lines) into cpu/dma.go. Halt cycle is a dummy read at the opcode-fetch PC; while (dmcDmaRunning || spriteDmaTransfer) loop on cycle parity (sprite reads on getCycle, writes on putCycle, DMC reads merged, alignment dummies). dmaStartCycle/dmaEndCycle(forRead) mirror Mesen's Start/EndCpuCycle master-clock split. CPU.read calls ProcessPendingDma(addr) at the top when needHalt is set. Peripherals flip from cpu.Stall(N) + per-cycle StallStepper to bare state signals: OAMDMA.Write calls cpu.SetNeedSpriteDma(page); dmcChannel.maybeRefill calls cpu.SetNeedDmcDma(). New cpu.DMCFetcher interface (GetDmcReadAddress / SetDmcReadBuffer — APU side). Retired: cpu.Stall, PendingStall, StallStepper, SetStallStepper, stallJustDrained, the entire stall-drain branch in exec.go, dmaScheduler in cmd/nessy/wiring.go, apu.StepDMCFetch, dmcChannel.Step. pendingStall field kept as a vestigial v1 save-state slot (state format frozen). The sub-cycle ordering Mesen encodes fixes the off-by-1 IRQ service timing test 4 irq_and_dma was failing under the coarse stall model. (2) Branch IRQ-poll quirk per Mesen NesCpu::BranchRelative: a taken non-page-crossing branch ignores an IRQ asserted at its last clock so the next instruction runs before service. NMI not affected. branch() in exec.go rolls back irqPollPrev when it just rose this cycle, matching _runIrq = false in Mesen. (3) NTSC 4-step frame counter non-uniform intervals: Mesen ApuFrameCounter.h:19 table puts steps at {7457, 14913, 22371, 29828} with a 2-cycle IRQ tail and reset at 29830 (per-frame total 29830 CPU cycles). chippy's uniform quarterFrameCycles = 7457 summed to 29828 — 2 cycles short. 
Added frameStepIntervalsNtsc4Step = [4]int{7456, 7458, 7457, 7459} reload table; advanceFrameStep uses it on NTSC 4-step. The 2-cycle drift per frame had been breaking test 5 branch_delays_irq's BVC loop sync (loop expects exactly 29830-cycle period). All three landed together — test 5 needed both the branch quirk (later sub-tests) AND the frame counter fix (test_jmp first sub-test) to clear. nestest byte-identical; demo SHAs unchanged.
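The frame-period claims in the two APU entries above reduce to simple sums, checked below. The table values are copied from those entries; the variable names are illustrative and the sketch makes no claim about where each step fires within the frame.

```go
package main

import "fmt"

// sum adds up a reload table to get the frame period in CPU cycles.
func sum(xs []int) int {
	total := 0
	for _, x := range xs {
		total += x
	}
	return total
}

func main() {
	uniform := []int{7457, 7457, 7457, 7457}        // old quarterFrameCycles model
	fourStep := []int{7456, 7458, 7457, 7459}       // table from #377
	sixStep := []int{7456, 7458, 7457, 1, 1, 7457}  // sub-step table from #380

	fmt.Println("uniform: ", sum(uniform))
	fmt.Println("4-entry: ", sum(fourStep))
	fmt.Println("6-entry: ", sum(sixStep))
	fmt.Println("drift per frame with the uniform model:", sum(fourStep)-sum(uniform))
}
```

Both non-uniform tables total 29830 cycles, while the uniform model totals 29828, which is the 2-cycle-per-frame drift the entry blames for the branch_delays_irq loop losing sync.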
  • instr_timing + unstable illegals (issue #318, nessy v0.9): Blargg instr_timing.nes now passes (added to the accuracy harness). It validates per-instruction cycle counts end-to-end — official + NOP + alternate-SBC timing all passed straight away on the cycle-accurate core; the 8 unstable illegal opcodes were the only gap (deliberately stubbed as NOP), so they got real implementations with correct addressing modes + cycle counts: XAA/ANE ($8B), LXA ($AB), SHA/AHX ($93 izy, $9F aby), SHY ($9C), SHX ($9E), TAS/SHS ($9B), LAS ($BB). Behaviour uses the common stable approximation (0xEE magic-constant for the immediate forms; reg & (high-byte+1) for the stores). KIL/JAM stay NOP-stubbed (the test skips them). nestest byte-identical.
  • Per-cycle CPU↔PPU interleave (issue #342, nessy v0.9): the big one. Blargg ppu_vbl_nmi jumps from 5/10 to 9/10 (tests 2-9 pass; only 10 even_odd_timing remains). For VariantNES, cpu.Step now runs in 1:1 lockstep: every bus access ticks the whole chain (PPU 3 dots / APU / cart) one cycle before the access (c.read/c.write), with the 6502's dummy cycles added per addressing-mode template (c.idle, addrDummies, RMW write-back, branch/stack/control). instrCycles is asserted to equal the instruction's accounted total; nestest (run as NES) exercises every legal+illegal opcode and stays byte-identical, pinning cycle-exactness. /NMI became a level the PPU drives (updateNMI → cpu.SetNMILine, = vblank-flag AND PPUCTRL.7); the CPU edge-detects after each cycle's bus op (sampleNMI), so the suppression race (test 6) falls out: a $2002 read that drops the line in the same cycle it would rise leaves no edge. The penultimate-cycle interrupt poll (nmiDue, one-cycle delay) gives the correct 1-instruction NMI latency; the edge latch is cleared before the 7 service cycles to avoid a spurious second NMI. NMOS/CMOS keep the instruction-stepped batch tick (chippy debugger, Klaus, decimal/BCD untouched). Demo SHAs unchanged (correct NMI timing realigns them; TestDemo_ASCIIReference confirms the pictures) and perfgate holds. Design + phases: docs/plans/per-cycle-cpu-ppu.md. Test 10 even_odd_timing followed: the odd-frame pre-render dot-skip now latches renderingEnabled() at dot 339 (oddSkipArmed) rather than sampling at dot 340, matching the hardware sample point relative to the $2001 BG-enable. ppu_vbl_nmi is now 10/10, a hard accuracy gate with knownFail cleared. #342 complete.
  • NMI interrupt-poll timing (issue #342, partial — nessy v0.9): Blargg ppu_vbl_nmi tests 4 (nmi_control) + 5 (nmi_timing) now pass on top of 2+3. The 6502 samples interrupts before an instruction's final cycle, so an NMI asserted on that cycle (e.g. a $2000 write that enables NMI while the vblank flag is already set) is recognised one instruction later. Modelled with cpu.nmiDue: for VariantNES, Step advances the PPU to the penultimate cycle, polls nmiPending into nmiDue, then ticks the final cycle + runs the body; the next Step services nmiDue. NMOS/CMOS keep the immediate edge-service path (byte-identical). nmiDue is save-state-serialised. Unit-tested in cpu/nmi_poll_test.go. Remaining (#342): tests 6-10 (suppression) need a $2002 read to race the /NMI edge at sub-cycle resolution (the read must land between the PPU's flag-set and the CPU's edge-sample) — the instruction-stepped pre-tick model can't represent it; needs true per-cycle CPU↔PPU interleave.
  • PPU vbl-flag cycle timing (issue #342, partial — nessy v0.9): the 2C02's $2002 vblank-flag set/clear races now pass Blargg ppu_vbl_nmi tests 2 (vbl_set_time) + 3 (vbl_clear_time). Two pieces: (1) per-cycle $2002 read sampling — for VariantNES, cpu.Step pre-ticks the bus ticker by the base instruction length before running the opcode body so a mid-instruction $2002 read samples the PPU at its true data-access dot instead of the previous instruction boundary; branch/page-cross extras tick after. Other CPU variants (the chippy debugger) keep the post-instruction batch tick, so their behaviour is byte-identical and nestest/Klaus are untouched. (2) vbl-flag race (ppu.go) — a monotonic dots counter records the dot the flag is raised (241,1) and auto-cleared (pre-render ,1); a read landing on the set dot reads bit 7 as 0 (set hasn't propagated CPU-side) and a read on the clear dot reads the pre-clear value (still set). Unit-tested in internal/nes/ppu/vbl_race_test.go; demo SHAs unchanged (the timing realigns). Remaining (#342): tests 4+ (nmi_control, nmi_timing, suppression) need cycle-accurate NMI edge polling — the line is sampled before an instruction's final cycle, so an NMI asserted on that cycle is recognised one instruction later; the interrupt-service path also doesn't yet tick the PPU for its 7 cycles. A deeper CPU-core change.
  • AOROM / AxROM mapper (issue #360, nessy v0.9): new internal/nes/cart/aorom.go — mapper 7. Single 32 KiB switchable PRG window at $8000-$FFFF (no fixed bank); any write selects the bank (bits 0-2, up to 8 banks / 256 KiB) + the single-screen nametable (bit 4 → lower / upper). 8 KiB CHR-RAM. No bus conflicts (AOROM proper; the AMROM/ANROM conflict variant is sub-mapper-gated like UNROM #319, deferred). Unlocks Battletoads, Marble Madness, R.C. Pro-Am, Wizards & Warriors, Jeopardy!. cart.Open dispatches mapper 7; cart-state union extends with AOROMState. Tests cover 32K bank switch (full-window — any $8000-$FFFF write latches), single-screen toggle, CHR-RAM round-trip, dispatch, save/restore.
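The register layout in this entry (bits 0-2 select the 32 KiB PRG bank, bit 4 selects the single-screen nametable) can be sketched as follows. The type and function names are hypothetical and are not taken from `aorom.go`; the modulo wrap for carts smaller than 256 KiB is an assumption of this sketch.

```go
package main

import "fmt"

// axromLatch is a hypothetical decoded view of the single AxROM register.
type axromLatch struct {
	prgBank   int  // 32 KiB bank index, register bits 0-2
	upperPage bool // single-screen nametable select, register bit 4
}

// decodeAxROM splits a value written anywhere in $8000-$FFFF.
func decodeAxROM(v byte) axromLatch {
	return axromLatch{prgBank: int(v & 0x07), upperPage: v&0x10 != 0}
}

// prgOffset maps a CPU address in $8000-$FFFF to an offset into PRG ROM.
// Wrapping by prgLen is an assumption for carts with fewer than 8 banks.
func prgOffset(l axromLatch, addr uint16, prgLen int) int {
	off := l.prgBank*0x8000 + int(addr&0x7FFF)
	return off % prgLen
}

func main() {
	l := decodeAxROM(0x13) // bank 3, upper nametable
	fmt.Printf("bank=%d upper=%v\n", l.prgBank, l.upperPage)
	fmt.Printf("offset=$%05X\n", prgOffset(l, 0x8004, 256*1024))
}
```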
  • nessy demo: mmc3-split (issue #323, nessy v0.9): new roms/demos/mmc3-split/ — MMC3 scanline-IRQ status-bar split, unblocked by the per-scanline A12 clock (#352). Flat-colour screen (blank nametable → universal BG colour); MMC3 IRQ armed via $C000=120 / $C001 reload / $E001 enable to fire ~120 scanlines in. IRQ handler rewrites $3F00 blue→green + acks ($E000/$E001); NMI restores blue + reloads the counter ($C001) each frame. Result: top blue, bottom green — driven by the mapper's A12-counted IRQ, not sprite-0-hit or cycle timing. 32 KiB PRG (4 × 8 KiB MMC3 banks); code + vectors in the fixed last bank at $E000. Test asserts top-row colour ≠ bottom-row + each region internally uniform. Proves #352's A12 clock end-to-end.
  • PPU per-scanline A12 clock (issue #352, partial — unblocks #323): real 2C02 does sprite-pattern fetches every scanline during hblank (dots 257-320), toggling PPU A12 even with zero sprites in range; MMC3's scanline IRQ counts those rising edges. Our burst renderer skipped the garbage fetches, so MMC3 IRQ only ticked on the ~8 scanlines with in-range sprites → wrong split position. Fix: stepDot emits one dummy sprite-pattern-table read (busRead(0x1000)) at dot 260 on every visible + pre-render scanline when rendering is enabled, after the dot-256 BG fetch has driven A12 low (common BG=$0000 / sprite=$1000 config). Value discarded, framebuffer untouched → every demo SHA holds; the only effect is the cart's A12 edge detector ticking. Test (internal/nes/ppu/a12_test.go, real cart.MMC3 + counting sink): the MMC3 scanline IRQ now fires ~30×/frame at latch 8 with no sprites (was 0); rendering-off → 0. First half of the #352 per-dot/per-cycle cluster; the #342 $2002-read-vs-vbl-set per-cycle half remains.
  • Public API doc + semver contract (issue #350, nessy v0.9): new docs/api.md documents the now-public 6502 core surface — cpu / peripheral / symbols / loader / expr / trace / dap — with a stability table (bare vX.Y.Z tags = library semver; cpu.Bus / cpu.Peripheral / cpu.Ticker / cpu.Variant are the contract types that bump major on change), per-package role + key exported types, a minimal CPU-loop example, and an explicit "what's NOT public" (internal/tui chippy-only; internal/nes/* → nessy repo at #351). Added to the mkdocs Reference nav. Pairs with #349 to give the standalone nessy repo + third parties a stable contract to pin.
  • Publicize shared core (issue #349, nessy v0.9): moved the 6502 core packages out of internal/ to public top-level paths so the future standalone nessy repo (#351) can import them — Go's internal/ blocks external modules. internal/{cpu,dap,symbols,peripheral,expr,loader,trace}github.com/nkane/chippy/{cpu,dap,...}. internal/tui (chippy-only) + internal/nes/* (nessy-only, moves at #351) stay private. Pure git mv + import-path rewrite across every package + the build-tagged nessy / wasm / record / accuracy / nestest sources; no behaviour change. The opcode-init lex-order invariant (opcodes.go < opcodes_cmos.go < opcodes_illegal.go) is preserved — the files moved as a unit inside cpu/. First v0.9 item; unblocks the repo split (#351) + the public-API doc (#350).
  • #23, #24, #26, #27, #28, #29 — earlier infra / features

  • #30 — Klaus functional test harness (GPL ROM, downloaded on demand, sha256 verified)

  • #31 — Cycle audit; introduced extraCycles side channel; fixes taken-branch undercount in Step() return

  • #32 — Full 65C02 CMOS support (variant enum, table dispatch, ~30 opcodes, 3 new addr modes, JMP-IND wrap fix, CMOS BCD with +1 cycle, WDC NOP fill, --cpu flag, ca65 demo + e2e test)

  • #33 — IRQ/NMI with edge/level semantics

  • #34 — MMIO peripheral abstraction (issue #16); routing bus + Apple-1-style TextOutput ($F001) and KeyboardInput ($F004/$F005)

  • #36 — Per-instruction execution trace (issue #21): cpu.Tracer hook on Step(), cpu.FileTracer (buffered 64K), -trace PATH CLI flag, :trace PATH|on|off TUI command

  • #38 — CLAUDE.md "docs are part of every PR" rule: README/context/help-modal/exported docs move with code

  • #39 — Stack panel JSR-frame annotation (issue #18): detects pushed return-address pairs via the $20 opcode at stored-2; renders ret $XXXX → callee file:NN; collapses non-frame runs; T toggles raw view
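The "$20 opcode at stored-2" check works because JSR pushes the address of its own last byte, so the opcode sits two bytes below the stored value. A minimal sketch of that heuristic follows; `detectFrame` and `frame` are hypothetical names for illustration, not the project's detectStackFrame.

```go
package main

import "fmt"

const jsrOpcode = 0x20

// frame describes one detected JSR return-address pair.
type frame struct {
	ret    uint16 // address execution resumes at after RTS (stored + 1)
	callee uint16 // the JSR operand
}

// detectFrame inspects the two stack bytes above sp: low byte at sp+1,
// high byte at sp+2. Byte arithmetic on sp wraps inside the stack page.
func detectFrame(mem *[0x10000]byte, sp byte) (frame, bool) {
	lo := mem[0x0100+uint16(sp+1)]
	hi := mem[0x0100+uint16(sp+2)]
	stored := uint16(hi)<<8 | uint16(lo)
	if mem[stored-2] != jsrOpcode {
		return frame{}, false
	}
	callee := uint16(mem[stored-1]) | uint16(mem[stored])<<8
	return frame{ret: stored + 1, callee: callee}, true
}

func main() {
	var mem [0x10000]byte
	// JSR $1234 assembled at $0600: 20 34 12.
	mem[0x0600], mem[0x0601], mem[0x0602] = 0x20, 0x34, 0x12
	// JSR pushed $0602 (high byte first), leaving SP at $FD.
	mem[0x01FF], mem[0x01FE] = 0x06, 0x02

	if f, ok := detectFrame(&mem, 0xFD); ok {
		fmt.Printf("ret $%04X -> callee $%04X\n", f.ret, f.callee)
	}
}
```

The check is a heuristic: any two stack bytes that happen to point two bytes past a $20 will match, which is why a data byte can produce a false frame.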

  • #40 — Memory editor (issue #19): byte-level MemCursor (arrow keys, auto-scroll), e enters hex edit mode at cursor; 1–2 hex chars, Enter commits, Esc cancels; cursor persists in state file; :goto aligns view AND moves cursor

  • #41 — Prompt history + tab-complete (issue #20): ~/.chippy/history (cap 100, dedup, auto-save), Up/Down recall, Tab completes verbs and :bp <symbol> against the loaded .dbg, Ctrl-R reverse-incremental search (Ctrl-R again walks older). Added symbols.Table.NamesWithPrefix.

  • v0.0.2 — release cut after #41. 7 features since v0.0.1; binaries + brew tap auto-updated.
  • #54 — Reverse step (issue #17): cpu.Snapshot / CPU.Snapshot/Restore capture full regs + RAM + bookkeeping; rewindRing (cap 256, FIFO eviction, LIFO pop) records pre-step state on explicit-step paths only (free-run skipped to avoid 64 KiB/step cost); < pops one; status bar shows rwd:N depth.
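The "FIFO eviction, LIFO pop" combination is a bounded stack: when full, a push overwrites the oldest entry, while a pop always returns the newest. A generic sketch follows. The `ring` type is hypothetical and is not the project's rewindRing; it assumes a capacity greater than zero and Go 1.18+ for generics.

```go
package main

import "fmt"

// ring is a bounded stack. Capacity must be greater than zero.
type ring[T any] struct {
	buf   []T
	start int // index of the oldest entry
	n     int // number of live entries
}

func newRing[T any](capacity int) *ring[T] {
	return &ring[T]{buf: make([]T, capacity)}
}

// push appends v; when the ring is full it overwrites the oldest entry.
func (r *ring[T]) push(v T) {
	if r.n == len(r.buf) {
		r.buf[r.start] = v
		r.start = (r.start + 1) % len(r.buf)
		return
	}
	r.buf[(r.start+r.n)%len(r.buf)] = v
	r.n++
}

// pop removes and returns the newest entry.
func (r *ring[T]) pop() (T, bool) {
	var zero T
	if r.n == 0 {
		return zero, false
	}
	r.n--
	i := (r.start + r.n) % len(r.buf)
	v := r.buf[i]
	r.buf[i] = zero // release the reference held by the slot
	return v, true
}

func main() {
	r := newRing[int](3)
	for i := 1; i <= 4; i++ {
		r.push(i) // pushing 4 evicts 1
	}
	for {
		v, ok := r.pop()
		if !ok {
			break
		}
		fmt.Println(v)
	}
}
```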

  • #55 — CMOS-aware disasm (issue #42): DisasmCPU / DisasmCPUWithSyms route through the CPU's opcode table so CMOS-only mnemonics (STZ/PHX/BRA/etc.) render correctly in the disasm panel, trace lines, and any future caller. Legacy Disasm/DisasmWithSyms retained as NMOS-default shims.

  • #56 — -run-on-start flag (issue #44): start the CPU running instead of paused; pair with -trace for non-interactive capture.

  • #57 — Trace interrupt-entry lines (issue #43): Tracer.LogInterrupt hook + FileTracer emits ---- NMI -> $FFFA (PC=... P=... SP=... CYC:...) markers at the service boundary, so trace readers can spot the PC jump in the next instruction.

  • #58 — Stack heuristic tightening (issue #45): detectStackFrame now also rejects frames whose stored return-address or JSR target falls below codeMinAddr = $0200 (zero-page + stack-page). Cuts most false positives without losing real frames.

  • #68 — Help modal paging: 4 pages, space/→ next, p/← prev, any other key closes. Splits the 10-section keybinding reference so the modal fits on small terminals.

  • #69 — DAP transport + initialize/launch/disconnect (issue #47): internal/dap package with Content-Length framing, request/response/event types, server dispatch loop; -dap stdio | tcp:PORT CLI flag. Launches construct CPU+RAM+MMIO from LaunchArguments matching the CLI flag shape; capabilities advertise everything #48–#53 will eventually wire (conditional bp / instruction bp / disassemble / readMemory / writeMemory etc.).

  • v0.1.0 — release cut after #69. Minor bump signals new DAP subsystem.
  • DAP step controls (issue #50): continue / next / stepIn / stepOut / pause / threads. continue spins a background goroutine that calls cpu.Step until pauseRequested flips true or the CPU halts; emits stopped event on exit. Single-step variants refuse while running and emit stopped after the synchronous step.
  • DAP stackTrace / scopes / variables / setVariable (issue #48): stackTrace walks JSR frames via cpu.DetectStackFrame (moved from tui/stack.go); scopes returns Registers + Flags; variables emits A/X/Y/SP/PC/P/Cycles for ref=1 and N/V/U/B/D/I/Z/C for ref=2; setVariable writes registers or toggles flags. Stack-frame detection moved from tui to cpu package as cpu.DetectStackFrame/StackCodeMinAddr.
  • DAP breakpoints (issue #49): setBreakpoints (source-line, resolved via srcMap.PCToSrc reverse-lookup) and setInstructionBreakpoints (address). Both are destructive against their respective namespace per DAP spec. Run loop checks bpHit (flattened union) at each Step and emits stopped with reason=breakpoint.
  • DAP disassemble / readMemory / writeMemory (issue #51): disassemble routes through cpu.DisasmCPU (variant-aware), reports address, instructionBytes, instruction, symbol, location/line; readMemory/writeMemory bypass MMIO so peripheral side-effects don't fire on debugger pokes. Base64 envelope per spec.
  • DAP evaluate (issue #52): evaluate request compiles + runs the same expression grammar used by :bp X if E. Expression compiler moved from internal/tui/cond.go to a new internal/expr package so DAP and TUI share semantics (expr.Compile, expr.EvalFn); tui.compileCondition is now a thin wrapper.
  • DAP example configs + onboarding docs (issue #53): docs/dap.md walkthrough, examples/dap/launch.json (VS Code) and examples/dap/nvim-dap.lua (nvim-dap). DAP-v1 epic complete.
  • v0.2.0 — release cut after #77 / DAP-v1 epic.
  • DAP stepBack (issue #79, first of #78 DAP-v2 epic): wires the rewind ring into DAP. Snapshot ring promoted from internal/tui to internal/cpu as cpu.SnapshotRing so both the TUI's < key and DAP's stepBack share storage. supportsStepBack: true.
  • DAP setFunctionBreakpoints (issue #82, DAP-v2): symbol-name bps via syms.LookupName. New bpsByName map joins bpsBySrc and bpsInst in rebuildBPHit's union. supportsFunctionBreakpoints: true.
  • DAP loadedSources + source (issue #84, DAP-v2): editor's Loaded Scripts pane lists every file in srcMap.Files; the source request returns joined-line content with basename fallback for clients passing absolute paths. supportsLoadedSourcesRequest: true.
  • DAP backward disassemble (issue #80, DAP-v2): walkBack promoted from internal/tui to internal/cpu as cpu.WalkBack; DAP's disassemble handler uses it for negative instructionOffset. Heuristic tightened to prefer earliest-start at equal sequence length, biasing toward real code boundaries.
  • DAP completions (issue #85, DAP-v2): debug-console autocomplete returns registers (A/X/Y/P/SP/PC), flag bits (N/V/B/D/I/Z/C), and .dbg symbol names matching the cursor's trailing identifier prefix. supportsCompletionsRequest: true.
  • DAP exception bps (issue #83, DAP-v2): brk filter advertised in initialize as exceptionBreakpointFilters. setExceptionBreakpoints flips brkOnException; run loop pauses before any $00 opcode and writes lastExceptionPC for the exceptionInfo response. supportsExceptionInfoRequest: true.
  • DAP bp condition/hitCondition/logMessage (issue #81, DAP-v2): every breakpoint family (source-line, instruction, function) honors the DAP modifier triple. New bpMeta per PC carries the compiled expr.EvalFn, hit target + running count, and an interpolating log template. shouldFireBP is the run-loop hit handler — logMessage emits an output event then continues without stopping.
  • DAP integration test (issue #86, DAP-v2): internal/dap/integration_test.go under build-tag integration. Builds the binary, spawns chippy -dap stdio, drives initialize → launch → setInstructionBreakpoints → continue → variables → stackTrace → disconnect via an in-test JSON wire client. New dap-integration CI job runs it on every push.
  • DAP attach v1 (issue #87, DAP-v2): Server.AttachExisting(AttachConfig) populates debuggee from an externally-built CPU/RAM/MMIO bundle without going through the loader. attach request now responds OK + emits stopped(entry) when a debuggee is wired. The TUI plumbing (:dap PORT command + shared CPU mutex) is deferred to #97.
  • v0.3.0 — release cut after DAP-v2 push.
  • Klaus 65C02 functional test (issue #59): internal/cpu/klaus_cmos_test.go runs against 65C02_extended_opcodes_test.bin (download-on-demand + sha256-pinned). v1 skipped behind CHIPPY_KLAUS_CMOS_STRICT env because chippy's CMOS undocumented-opcode slots aren't WDC-spec'd yet (bug #99); test infrastructure is ready for #99's fix to be validated against.
  • Exhaustive BCD test (issue #60): internal/cpu/decimal_exhaustive_test.go (build tag decimal) walks every (N1, N2, cin) through ADC and SBC in decimal mode for both variants — 524 288 cases total. Caught a real CMOS BCD bug on invalid-nibble inputs; fix applied to adcDecimalCMOS / sbcDecimalCMOS (Bruce Clark Appendix B algorithm). New decimal CI job runs the suite on every push.
  • CMOS e2e CI (issue #61): new cmos-e2e workflow job installs cc65, builds example/cmos_demo.bin, runs the existing e2e test with CHIPPY_CMOS_E2E_STRICT=1 so missing fixtures fail the build instead of silently skipping.
  • CMOS NOP fills + interrupt D-clear (issue #99): WDC-spec NOP widths for undefined CMOS slots ($44=ZP, $54/$D4/$F4=ZPX, $DC/$FC=ABS, $5C=ABS-quirky 8-cycle). BRK / serviceIRQ / serviceNMI now clear D on CMOS variant (NMOS bug preserved). Klaus 65C02 functional test now passes end-to-end and runs unconditionally in CI.
  • v0.3.1 — patch release after the CMOS correctness pass.
  • example/c/ — cc65-based C example programs (hello, sum, fizzbuzz) with shared chippy.cfg linker config + minimal crt0.s runtime. Builds via make -C example/c; runs via chippy -rom example/c/<prog>.bin. Source-map loader updated to prefer .c files over .s intermediates when both are recorded for the same PC, so the TUI source view (v) shows C source while stepping.
  • Immediate window (issue #70): I opens a modal REPL backed by internal/expr. Each Enter evaluates the buffer against current CPU state, appends expr → result to scrollback. The last expression can be recalled. Result formatting matches DAP's evaluate response so both surfaces report identical values.
  • Peripheral snapshots (issue #62): cpu.Snapshot grew a Peripherals map[string][]byte field; TUI and DAP both capture TextOutput buffer + Keyboard latch state into it on every push and restore on every pop. New peripheral.Snapshotable interface (Snapshot/Restore); both TextOutput and KeyboardInput implement it. Reverse-step across an MMIO write/read no longer desyncs the visible peripheral state.
  • CoW RAM snapshots (issue #66): cpu.Snapshot.RAM [0x10000]byte is now Pages map[byte][256]byte. RAM gained an opt-in (EnableShadow) page-level write barrier that captures pre-write images. Two-phase capture protocol — caller takes the snapshot before the step, resets the shadow, runs the step (or multi-step sweep), then claims snap.Pages = ram.TakeShadow() and pushes. Typical 1-instr snap is ~hundreds of bytes vs 64 KiB before, so free-run now pushes on every step in both TUI tickMsg loop and DAP runLoop — reverse-step works across an unattended continue. 1000-iteration tight loop costs <1 MiB of total ring storage (validated by test).
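The page-level write barrier can be sketched as a map from page number to the page's pre-write image, captured the first time a page is written after the shadow is reset. The `ram` type and its methods below are hypothetical illustrations of that idea, not the project's RAM API.

```go
package main

import "fmt"

// ram is a hypothetical 64 KiB memory with an opt-in page-level shadow.
type ram struct {
	mem    [0x10000]byte
	shadow map[byte][256]byte // nil while shadowing is disabled
}

// enableShadow turns on (or resets) pre-image capture.
func (r *ram) enableShadow() { r.shadow = make(map[byte][256]byte) }

// write stores v, first saving the page's pre-write image once per capture.
func (r *ram) write(addr uint16, v byte) {
	if r.shadow != nil {
		page := byte(addr >> 8)
		if _, seen := r.shadow[page]; !seen {
			var img [256]byte
			base := int(page) << 8
			copy(img[:], r.mem[base:base+256])
			r.shadow[page] = img
		}
	}
	r.mem[addr] = v
}

// takeShadow hands the captured pre-images to the caller and starts a fresh capture.
func (r *ram) takeShadow() map[byte][256]byte {
	pages := r.shadow
	r.shadow = make(map[byte][256]byte)
	return pages
}

// restore copies saved pre-images back, undoing the writes they cover.
func (r *ram) restore(pages map[byte][256]byte) {
	for page, img := range pages {
		base := int(page) << 8
		copy(r.mem[base:base+256], img[:])
	}
}

func main() {
	r := &ram{}
	r.mem[0x0200] = 0xAA
	r.enableShadow()

	// One "step" that touches two pages.
	r.write(0x0200, 0x01)
	r.write(0x0201, 0x02)
	r.write(0x0300, 0x03)

	pages := r.takeShadow()
	fmt.Println("pages captured:", len(pages))

	r.restore(pages)
	fmt.Printf("$0200 after restore: $%02X\n", r.mem[0x0200])
}
```

A step that writes to two pages stores 512 bytes of pre-images rather than a full 64 KiB copy, which is the saving that makes per-step snapshots cheap enough for free-run.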
  • VS Code extension (issue #88): extension/vscode-chippy/ — minimal TypeScript package that registers the chippy debug type and supplies a DebugAdapterDescriptorFactory that spawns chippy -dap stdio. package.json declares launch attributes, configuration snippets, and a chippy.binaryPath setting. npm run package produces an installable .vsix.
  • WebAssembly playground (issue #67): cmd/chippy-wasm/ builds a js/wasm binary that installs a window.chippy global (load / step / run / state / disasm / readMem / textOutput / pushKey / setVariant). web/ ships the HTML/JS shell — make -C web serve builds + serves on :8080. Demos copy from example/. ld65/.o pipeline is explicitly out of scope (no shell-out in the browser); .bin / .prg / .hex parsing is inlined in the WASM main. New wasm CI job keeps the build target green. GitHub Pages auto-deploy via pages.yml.
  • v0.4.0 — release cut after #62 / #66 / #88 / #67 ship.
  • CPU correctness micro-audit (issue #122): WAI ($CB) and STP ($DB) were placeholder NOPs; now WAI halts until any IRQ/NMI (waking even on masked IRQ — falls through to next instruction without dispatching the handler) and STP halts permanently (new stoppedBySTP latch; only Reset() clears). Regression tests cover the halt/wake matrix plus IZP $FF zero-page wrap, PHP B/U push, IRQ B-clear push, and CMOS RTI D-restore.
  • expr unary minus width-aware (issue #129): -1 now evaluates to $FF instead of $FFFFFFFF; pick-smallest-power-of-two-width rule keeps A == -1 matching a register holding $FF. Binary subtraction stays 32-bit modular by design. First-ever tests for internal/expr/.
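One way to read the width rule: negate the operand in 32 bits, then truncate to the narrowest of 8, 16, or 32 bits that can hold the operand. The sketch below encodes that reading. It is an assumption about how the rule behaves for operands wider than a byte; the entry only confirms the `-1` case. `negate` is a hypothetical name.

```go
package main

import "fmt"

// negate returns -v truncated to the smallest of 8/16/32 bits that holds v.
// This is an illustrative reading of the rule, not the project's evaluator.
func negate(v uint32) uint32 {
	switch {
	case v <= 0xFF:
		return uint32(uint8(-v))
	case v <= 0xFFFF:
		return uint32(uint16(-v))
	default:
		return -v
	}
}

func main() {
	fmt.Printf("-1      = $%X\n", negate(1))
	fmt.Printf("-$100   = $%X\n", negate(0x100))
	fmt.Printf("-$10000 = $%X\n", negate(0x10000))
}
```

With this rule, `A == -1` compares against $FF, so it matches an 8-bit register holding $FF without the caller masking the result.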
  • TextOutput bounded buffer (issue #128): peripheral.TextOutput now drops the oldest quarter when its buffer hits cap (default 64 KiB; --text-buf-cap overrides; 0 = unbounded). New :textsave PATH TUI command dumps the live buffer to disk. Prevents OOM on long-running programs and keeps reverse-step snapshots bounded.
  • DAP advertised-but-missing gaps (issue #123): supportsBreakpointLocationsRequest was advertised but had no handler — now wired (line-granularity lookup against srcMap.PCToSrc). launch.stopOnEntry and attach.stopOnEntry are now *bool; explicit false auto-starts the run loop / suppresses the entry stopped event. writeMemory.allowPartial=false rejects overflowing writes instead of silently truncating.
  • DAP input validation hardening (issue #124): readMemory rejects negative Count; disassemble clamps large-negative Offset and rejects negative InstructionCount; evaluate refuses while the run loop is in flight (was racing CPU/RAM reads); stepOut detects SP rises across the 8-bit wrap via signed-delta comparison; duplicate source-line and instruction breakpoints surface a verified:false "duplicate ... — first entry kept" message instead of silently overwriting.
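The signed-delta comparison for stepOut relies on 8-bit modular arithmetic: subtracting the starting SP from the current SP and reinterpreting the result as a signed byte gives a positive value when the stack has unwound, even if SP wrapped from $FF to $00 on the way. A minimal sketch, with `spRose` as a hypothetical helper name:

```go
package main

import "fmt"

// spRose reports whether the stack pointer moved upward (entries were
// pulled) relative to start, treating the 8-bit difference as signed.
func spRose(start, now byte) bool {
	return int8(now-start) > 0
}

func main() {
	fmt.Println(spRose(0xFD, 0xFF)) // plain rise within the page
	fmt.Println(spRose(0xFE, 0x01)) // rise that wraps past $FF
	fmt.Println(spRose(0x10, 0x0E)) // SP fell: a push, not a return
}
```

A plain `now > start` comparison would misclassify the wrapped case, since $01 is numerically below $FE.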
  • TUI help-modal + Tab completion polish (issue #127): help modal grew a "Prompt verbs" section listing every : command with concrete syntax (no more "guess the modifier syntax"). Tab completion extended beyond verb-only — :trace on/off, :speed <hz>, :bp X <modifier> (once/hits/if/log), and the new :textsave verb all complete from arg-pool. Symbol completion still works at arg-1 of address-taking verbs.
  • State-file format freeze (issue #112): new StateSchemaVersion = 1 written into every saved file. Loader treats absent version as v0 legacy (still decodes), == 1 as current, > 1 as silent ignore so an older chippy preserves a newer build's state. internal/tui/testdata/state-v1.json is the pinned golden; TestLoadState_GoldenV1 fails when a tag or struct field changes incompatibly. docs/state-format.md documents the contract; CLAUDE.md cross-references it. Pre-existing bug fixed along the way: loadMemBPs was only called on the legacy-decode path, dropping memory watchpoints from any new-shape file.
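The three-way version gate can be sketched as a small classifier that peeks at the version field before the full decode. The JSON key `schemaVersion` and the names below are assumptions made for this illustration; docs/state-format.md is the authority on the actual field name.

```go
package main

import (
	"encoding/json"
	"fmt"
)

const stateSchemaVersion = 1

// header peeks at only the version field. The JSON key is hypothetical.
type header struct {
	SchemaVersion int `json:"schemaVersion"`
}

type action int

const (
	decodeLegacy  action = iota // field absent: v0 file, still decoded
	decodeCurrent               // version matches this build
	ignoreNewer                 // written by a newer build: leave the file alone
)

// classify decides how a saved-state blob should be handled.
func classify(raw []byte) (action, error) {
	var h header
	if err := json.Unmarshal(raw, &h); err != nil {
		return decodeLegacy, err
	}
	switch {
	case h.SchemaVersion > stateSchemaVersion:
		return ignoreNewer, nil
	case h.SchemaVersion == stateSchemaVersion:
		return decodeCurrent, nil
	default:
		return decodeLegacy, nil
	}
}

func main() {
	for _, raw := range []string{`{}`, `{"schemaVersion":1}`, `{"schemaVersion":2}`} {
		a, err := classify([]byte(raw))
		fmt.Println(raw, a, err)
	}
}
```

An absent field decodes to the zero value, which is what lets legacy files fall through to the v0 path without a separate presence check.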
  • State-file content completeness (issue #125): savedState grew DisasmFollow, StackAnnotate, InputMode, DisasmAnchor, and ImmediateHistory. The two booleans serialize as *bool so a legacy v0 file's absence doesn't clobber the New(c, r) true defaults. Loader gates the new fields on SchemaVersion >= 1 for the same reason. Golden file extended; new tests cover legacy-defaults-preserved and round-trip-of-additions.
  • Release hardening (issue #130): goreleaser builds gain -trimpath + -buildvcs=true for reproducible / verifiable provenance; cosign keyless signing produces *.cosign.bundle per artifact (verify via cosign verify-blob --certificate-identity=https://github.com/nkane/chippy/.github/workflows/release.yml@refs/tags/<TAG> --certificate-oidc-issuer=https://token.actions.githubusercontent.com); syft emits SPDX SBOMs per archive; CI gained a govulncheck job that runs on every push; npm Dependabot now tracks the VS Code extension's deps; SECURITY.md documents the reporting flow + the hardening baseline.
  • Docs hygiene (issue #131): README grew a "Why chippy" section with positioning vs py65 / lib6502 / visual6502; new CONTRIBUTING.md covers branch flow + commit style + quality bar + the "docs are part of every PR" rule; new CHANGELOG.md in Keep-a-Changelog format backfills v0.0.1 → v0.4.0 + an Unreleased section tracking the v1.0 epic; new docs/editors.md carries the editor-integration matrix.
  • Perf baseline + CI regression gate (issue #113): internal/cpu/bench_test.go ships three benchmarks — BenchmarkStep_NMOS, BenchmarkStep_CMOS, BenchmarkStep_WithSnapshot. New perfgate build-tag test compares measured ns/op against testdata/perf-baseline.json and fails on >15% regression. New perf baseline CI job runs the gate on every push. Refresh procedure documented in docs/perf-baseline.md.
  • NO_COLOR + colorblind themes (issue #126): new internal/tui/theme.go defines four palettes — default, mono, protan (red-green safe), tritan (blue-yellow safe). NO_COLOR env forces mono regardless of --theme. :theme NAME runtime command; persisted in the state file's new theme field; arg-completed by Tab. Help modal grew a Theme section. Tests cover env routing, applyTheme global swap, and round-trip persistence.
  • WASM playground hardening (issue #132): new boot-error banner renders the underlying WASM-load failure (e.g. file:// MIME refusal) with a hint to use make -C web serve. CSP meta enforces default-src 'self' + frame-ancestors 'none' for clickjacking + script-injection defense; note that browsers ignore frame-ancestors in a meta-delivered policy, so the clickjacking half only takes effect when the policy is sent as an HTTP response header. New sw.js service worker caches static assets for offline use. share button copies a #rom=<base64>&format=&addr=&variant= permalink (bytes stay client-side via URL fragment). Mobile-responsive: panes reflow to single column under 800 px.
  • VS Code extension tests + disconnect docs (issue #133): @vscode/test-electron harness compiles src/test/{runTest,suite/index,suite/extension.test}.ts. Smoke tests cover presence + activation + manifest-declared debug type + the chippy.binaryPath setting. New vscode-ext CI job runs the suite under xvfb-run. package-lock.json is now committed so npm ci is deterministic. Extension README documents disconnect / crash handling and the test command.
  • ca65 syntax highlighting (issue #117): TextMate grammar at extension/vscode-chippy/syntaxes/ca65.tmLanguage.json covers NMOS + 65C02 mnemonics, directives (.proc, .segment, .byte, .if, etc.), hex / binary / decimal literals, labels, comments, registers, operators. Files matching .s / .s65 / .asm / .inc are auto-tagged. New ca65.language-configuration.json enables ; comment toggling + bracket pairing. Snippets file ships reset-vector / .proc / .ifdef / halt-loop / Apple-1 putc templates. .vsix now ships 8 files, 7.37 KB.
  • CMOS 65C02 cycle audit (issue #111): new internal/cpu/cmos_cycles_test.go exercises CMOS-only opcode cycles (BRA / INA / DEA / PHX / PLX / PHY / PLY / STZ / TRB / TSB / JMP (abs,X) / LDA (zp) / RMBx / SMBx / BBRx / BBSx), the BCD +1-cycle penalty under FlagD on ADC / SBC, the WDC NOP fills (1-byte/1-cycle defaults + the documented ZP/ZPX/ABS multi-byte slots + the quirky 8-cycle $5C), and WAI / STP. The audit surfaced two bugs in the CMOS opcode table: BRA's table entry had a base of 3 cycles, so it computed as 4/5 instead of 3/4 once branch() added its +1 for the always-taken branch, and the ZPX-prefixed NOP slots $54 / $D4 / $F4 were incorrectly routed through case 0x04 and registered as 2-byte/3-cycle ZP NOPs instead of 2-byte/4-cycle ZPX NOPs. Both fixed; Klaus still green.
  • DAP TUI attach (issue #97): new :dap PORT TUI command spawns the embedded DAP server in attach mode against the live CPU. AttachConfig.CPUMu carries a shared *sync.Mutex the DAP dispatch() and runLoop() take per iteration; Model.step() takes the same mutex when set. Model.DAPListenAddr surfaces the TCP address; :dap reports state, :dap stop closes the listener. Model.SrcMap retains the live symbols.SourceMap pointer so the embedded server can resolve source breakpoints. Race-detector test confirms step() blocks while the mutex is held.
  • Linux distribution beyond brew (issue #118): goreleaser nfpms: block produces .deb, .rpm, and .apk packages per release. New aurs: block publishes chippy-bin PKGBUILD to AUR (gated on AUR_SSH_PRIVATE_KEY secret; skip-upload-auto so dev builds don't push). README install table covers Debian/Ubuntu (dpkg -i), Fedora/RHEL (rpm -i), Alpine (apk add), Arch (yay -S chippy-bin).
  • Examples expansion (issue #119): four new ca65 demos. mul16.s (16x16 → 32-bit shift-add multiply, ZP state, ADC carry propagation), echo.s (Apple-1 I/O — poll $F005, read $F004, echo to $F001), timer_irq.s (IRQ vector + RTI handler alongside a busy main loop), guess.s (interactive number-guess state machine driven from MMIO keyboard). All build through the existing Makefile; README grouped under math / arithmetic / I/O / interrupts / CMOS categories with explicit watching tips.
  • chippy.dev documentation site (issue #116): new mkdocs.yml configures MkDocs-Material; docs/index.md + docs/quickstart.md land as new pages alongside the existing reference docs. Pages workflow installs mkdocs-material, builds docs/ to _site/, copies web/ into _site/playground/, and uploads the combined artifact. Site root becomes the docs landing; /chippy/playground/ is the WASM playground. mkdocs build --strict is part of the deploy gate.
  • VS Code marketplace publish prep (issue #114): release workflow gains a vscode-extension job that syncs extension/vscode-chippy/package.json to the tag version (npm version --no-git-tag-version), npm ci && npm run compile, then vsce publish --pat $VSCE_PAT. Skipped on prerelease tags (anything with - in the name) since the marketplace can't unpublish. Extension README documents the PAT setup. AUR_SSH_PRIVATE_KEY secret is now also passed through so the AUR upload from #118 fires on real releases.
  • Trace replay (issue #64): new internal/trace package parses chippy's -trace output back into a navigable []Frame. cmd/chippy --trace-replay PATH opens the TUI in replay mode — s advances a frame, < rewinds, the CPU's regs are synced from the active frame so every panel renders as if paused at that PC. Help-modal grew a "Trace replay" section. Tests cover parse-basic, step+seek, malformed/empty lines.
  • CPU bus-ticker hook (issue #175, nessy v0.1 spike): new cpu.Ticker interface; cpu.Step() invokes Tick(cycles) after every instruction. Cached at c.busTicker via c.SetBus(b) so the no-ticker fast path is a single nil-check (BenchmarkStep_NMOS stays ~8 ns/op, BenchmarkStep_WithTicker adds <1 ns). MMIO fans out to peripheral Tickers + forwards to its Inner bus's Ticker. tui.WBus forwards to its inner. Foundation for the nessy PPU / APU.
  • CPU VariantNES (issue #174, nessy v0.1): Ricoh 2A03 variant — NMOS opcode table, but ADC / SBC ignore FlagD even when set. FlagD itself still toggles via SED / CLD / PHP / PLP for programs that probe it; only the BCD adder is missing in silicon. -cpu nes / cpuVariant: "nes" (DAP launch) / chippy.setVariant("nes") (WASM) all accept the new variant. Klaus 6502 functional test untouched; perfgate ceilings hold.
  • nes/ines loader (issue #173, nessy v0.1): new internal/nes/ package. Parse(io.Reader) / ParseBytes([]byte) consume an iNES (or NES 2.0) file into a *ROM{Mapper, Mirroring, Battery, Trainer, PRG, CHR, NES2}. Header validation rejects bad magic, zero PRG-bank claims, truncated bank data. Mapper byte assembled from flag6 high nibble + flag7 high nibble; mirroring honors the four-screen override. NES 2.0 header detected via flag7 bits 2-3 = 0b10 (extensions are best-effort parsed for v0.1). Tests cover NROM happy path, mapper-byte assembly (NROM, MMC1, MMC3, AOROM, GxROM), mirroring matrix, battery flag, trainer, CHR-RAM, NES 2.0 detection, 5 malformed-input rejections.
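The mapper-byte assembly and NES 2.0 detection described here come down to a few bit operations on header bytes 6 and 7. A minimal sketch follows; the function names are hypothetical and not taken from the package, and only the 8-bit iNES mapper number is covered.

```go
package main

import "fmt"

// mapperNumber combines the low nibble (flag6 bits 4-7) and the
// high nibble (flag7 bits 4-7) into the 8-bit iNES mapper number.
func mapperNumber(flag6, flag7 byte) byte {
	return flag7&0xF0 | flag6>>4
}

// isNES2 reports whether flag7 bits 2-3 carry the NES 2.0 signature 0b10.
func isNES2(flag7 byte) bool {
	return flag7&0x0C == 0x08
}

func main() {
	cases := []struct {
		name         string
		flag6, flag7 byte
	}{
		{"NROM", 0x00, 0x00},
		{"MMC1", 0x10, 0x00},
		{"MMC3", 0x40, 0x00},
		{"AOROM", 0x70, 0x00},
		{"GxROM", 0x20, 0x40},
	}
	for _, c := range cases {
		fmt.Printf("%-5s mapper=%d nes2=%v\n", c.name, mapperNumber(c.flag6, c.flag7), isNES2(c.flag7))
	}
}
```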
  • nes/cart NROM (issue #176, nessy v0.1): new internal/nes/cart/ subpackage. Cartridge interface (CPURead/CPUWrite/PPURead/PPUWrite/Mirroring); Open(*nes.ROM) factory dispatches by mapper number. v0.1 ships mapper 0 (NROM) only — 16 KiB carts mirror $8000-$BFFF to $C000-$FFFF, 32 KiB carts map directly. CHR-RAM variant (rom.CHR nil) allocates 8 KiB and makes PPU writes effective; CHR-ROM carts silently drop writes. Tests cover 32K direct map, 16K mirror, unmapped-below-$8000, PRG-write-noop, CHR-ROM read-only, CHR-RAM round-trip, PPU-above-$1FFF-ignored, bad PRG size rejection, Open dispatch + unsupported-mapper error.
  • nes/joypad (issue #178, nessy v0.1): new internal/nes/joypad/ package. Port is a cpu.Peripheral claiming $4016-$4017 with two Controllers. Strobe-line model: a write to $4016 bit 0 latches both controllers' shift registers; reads from $4016 / $4017 shift out A, B, Select, Start, Up, Down, Left, Right in that order. While the strobe is held high reads return the live A-bit continuously; after the eighth read the register is drained and reads return 1 (open-1 silicon). Host-side Set(Button, pressed) mutator drives live state; Ebiten input mapping will wire up in #179 when cmd/nessy lands. Tests cover shift order, strobe-high continuous read, latch-at-strobe snapshot semantics, P1/P2 independence, and $4017 write isolation.
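The strobe and shift behaviour can be sketched with one byte of live state and one byte of shift register per controller. The `controller` type below is hypothetical and simplified: it latches on the write that sets the strobe bit, which is one reading of the entry, and it shifts in ones so that reads after the eighth return 1.

```go
package main

import "fmt"

// controller is a hypothetical model of one standard NES pad.
// Bit order in live/shift: A, B, Select, Start, Up, Down, Left, Right (bit 0 first).
type controller struct {
	live   byte // current button state, driven by the host
	shift  byte // latched copy that reads shift out
	strobe bool
}

// writeStrobe models a $4016 write; bit 0 high latches the live state.
func (c *controller) writeStrobe(v byte) {
	c.strobe = v&1 != 0
	if c.strobe {
		c.shift = c.live
	}
}

// read returns the next bit. While the strobe is high it keeps returning live A.
func (c *controller) read() byte {
	if c.strobe {
		return c.live & 1
	}
	bit := c.shift & 1
	c.shift = c.shift>>1 | 0x80 // shift in 1s: drained reads return 1
	return bit
}

func main() {
	c := &controller{live: 0x09} // A and Start held
	c.writeStrobe(1)
	c.writeStrobe(0)
	for i := 0; i < 10; i++ {
		fmt.Print(c.read(), " ")
	}
	fmt.Println()
}
```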
  • nes/ppu (issue #177, nessy v0.1): new internal/nes/ppu/ package. PPU is a cpu.Peripheral claiming $2000-$3FFF (8-byte register window mirrored). 2 KiB internal VRAM with cart-driven horizontal / vertical / four-screen mirroring; 256 B OAM; 32 B palette RAM with $3F10/$14/$18/$1C hardware mirror to $3F00/$04/$08/$0C; pattern-table reads / writes routed to the cart's PPU bus. Tick(cpuCycles) advances 3 dots per CPU cycle through the 341 × 262 NTSC frame; vblank flag flips at scanline 241 dot 1 and raises NMI if PPUCTRL bit 7 is set (incl. the 2C02 quirk where setting bit 7 mid-vblank fires an immediate NMI). $2007 reads are buffered (palette reads bypass); $2006 / $2005 share the standard write-toggle. Background-only renderer fires at vblank entry: walks 30 × 32 tiles, fetches nametable + attribute + pattern, emits 256 × 240 RGBA into FrameBuffer() using the embedded 64-color 2C02 palette. Out of scope for v0.1 (deferred to v0.2+): sprites, sprite-0 hit, mid-frame scrolling (v/t/x/w latch model), $4014 OAMDMA (needs a CPU-stall hook that doesn't exist yet), greyscale / color emphasis, pre-render dot skip. Tests cover vblank timing + NMI gating, status-read clears vblank + latch, late-NMI quirk, VRAM r/w with auto-increment 1 and 32, palette buffer bypass, scroll latch toggle, palette mirror, horizontal + vertical nametable mirroring, OAMDATA cursor bump, mirrored-register-window decode, and synthetic-CHR uniform-background render against the 2C02 palette table.
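Two of the details in this entry reduce to short arithmetic: the palette mirror of $3F10/$14/$18/$1C onto $3F00/$04/$08/$0C, and the 3-dots-per-CPU-cycle advance through a 341 × 262 frame. The sketch below illustrates both; `paletteIndex` and `advance` are hypothetical names, and the real PPU tracks more state per dot than this.

```go
package main

import "fmt"

const (
	dotsPerLine   = 341
	linesPerFrame = 262
)

// paletteIndex maps a PPU address in $3F00-$3FFF to an index into the
// 32-byte palette RAM, folding entries $10/$14/$18/$1C onto $00/$04/$08/$0C.
func paletteIndex(addr uint16) byte {
	i := byte(addr & 0x1F)
	if i&0x13 == 0x10 {
		i &^= 0x10
	}
	return i
}

// advance moves the (scanline, dot) position forward by 3 dots per CPU cycle.
func advance(scanline, dot, cpuCycles int) (int, int) {
	total := scanline*dotsPerLine + dot + 3*cpuCycles
	total %= dotsPerLine * linesPerFrame
	return total / dotsPerLine, total % dotsPerLine
}

func main() {
	for _, a := range []uint16{0x3F00, 0x3F10, 0x3F14, 0x3F11, 0x3F1C} {
		fmt.Printf("$%04X -> palette[$%02X]\n", a, paletteIndex(a))
	}
	sl, dot := advance(0, 0, 114) // 114 CPU cycles = 342 dots
	fmt.Printf("after 114 cycles: scanline %d dot %d\n", sl, dot)
}
```

Only the entries with both low bits clear are mirrored, which is why $3F11 keeps its own slot while $3F10 shares one with $3F00.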
  • DAP client + -dap-attach Phase A (issue #180, nessy v0.1, partial): new dap.Client (Dial / NewClient / Initialize / Attach / Disconnect / Request / Events / Close) — the editor-side counterpart of the existing dap.Server. Read goroutine demuxes wire bytes into per-seq response channels and a buffered events fanout. chippy -dap-attach tcp:HOST:PORT drives the initialize+attach handshake, prints the server's advertised capabilities + initial events, then disconnects cleanly. Mutually exclusive with -rom / -dap. Phase B (introduce CPUSource interface, retrofit local mode) and Phase C (wire a DAP-backed source under attach so step keys/breakpoints drive the remote) are tracked separately and follow in their own PR — the TUI does not run in attach mode yet, since it can't drive a remote CPU until the refactor lands. Client tests use in-process net.Pipe to drive a real dap.Server through Initialize+Attach+Disconnect, concurrent-request demux, post-close request fail-fast, and parseDialAddr parsing.
  • cmd/nessy (issue #179, nessy v0.1): new cmd/nessy/ binary — the NES emulator entry point that ties together iNES loader (#173), NROM cart (#176), CPU VariantNES (#174), PPU (#177), and joypad (#178) under an Ebiten game loop, with chippy's DAP server attached on TCP for remote TUI control. Build-tagged nessy because Ebiten requires X11 / GL dev headers on Linux that the default CI runners don't carry — a !nessy stub in main_stub.go prints build instructions so go build ./... stays green. Wiring lives in wiring.go (untagged, unit-testable on every platform): the construction order is cart → MMIO over RAM → register cart-CPU-side wrapper ($4020-$FFFF) + joypad ($4016-$4017) → CPU on MMIO (Reset reads the cart's $FFFC vector through the registered wrapper) → PPU on cart + CPU → register PPU on MMIO → final Reset. Game loop runs ~29830 CPU cycles per Update (1.789773 MHz ÷ 60 fps), polls keyboard → joypad (Arrows / Z / X / Enter / Right-Shift = D-pad / A / B / Start / Select), and blits ppu.FrameBuffer() to the Ebiten screen via WritePixels with Layout(256, 240) for Ebiten-managed integer scaling. DAP listener runs in a goroutine via AttachExisting with a shared cpuMu mutex — same pattern chippy's :dap command uses. Flags: -rom PATH (or positional), -dap-port N (default 14785, 0 to disable), -scale N (default 3), -mute (no-op until APU lands in v0.2). Wiring tests verify cart-driven reset-vector wiring, CPU → PPU register reach via MMIO, and the cart's $4020-$FFFF range claim. Known v0.1 limitation: when a DAP client issues continue, the server's run loop and the game loop both call Step() under the mutex, double-stepping. Real pause/run gating with an atomic Paused flag is a v0.2 polish item.
  • nestest golden-PC walk (issue #181, nessy v0.1): new cmd/nessy/nestest_test.go runs the canonical nestest.nes headless test at PC=$C000, single-stepping the live CPU against nestest.log (Nintendulator reference) and asserting PC + A + X + Y + P + SP match line-by-line. Fixtures downloaded on first run from https://www.qmtpro.com/~nes/misc/ and cached under $XDG_CACHE_HOME/chippy-tests/ with SHA-256 pins (f67d55fd… ROM / 627c8e18… log); CHIPPY_NESTEST_BIN / CHIPPY_NESTEST_LOG env vars override with a local path. Build-tagged nestest (mirrors klaus); new CI job nestest on ubuntu-latest. Validates the iNES → cart → MMIO → CPU + PPU integration end-to-end — the headless run passes 8991 instructions with zero divergences against the golden log. Closes the v0.1 acceptance criterion that "nestest's PPU-touching opcodes don't trip" (#177) and "nestest passes end-to-end in CI" (#182).
  • nessy demo: hello-bg (issue #194, epic #193): new roms/demos/hello-bg/ — first homemade nessy demo. Static title screen renders "HELLO NESSY" centered on the nametable. Hand-rolled ca65 source + ld65 NROM-128 config + 8 KiB inline CHR-ROM with glyphs at their ASCII tile indices. Built .nes committed so toolchain installation isn't required. roms/demos/Makefile builds via make hello-bg (cc65 toolchain). Test driver cmd/nessy/demo_test.go boots the ROM through buildNES headlessly, advances 5 frames (driving the PPU manually past the JMP self halt detection so the renderer catches post-init state), hashes the framebuffer, and compares to a pinned SHA-256 (helloBGFrameSHA). Inspect variant (CHIPPY_DEMO_INSPECT=1) dumps a textual screenshot. Validates full PPU pipeline end-to-end with a real hand-rolled ROM: palette write via $2006/$2007, nametable + attribute clear loop, tile string write, BG-show enable, vblank-wait spin loops, reset vector wiring, and the NROM cart's $4020-$FFFF CPU-side claim. Along the way exposed a CPU bug: the chippy-debugger heuristic that flips c.Halted = true on JMP self was wrong for NES code (legitimate NMI-driven idle pattern stalls the bus-ticker fan-out → PPU freezes). Fix: c.Variant != VariantNES gate; NMOS/CMOS behavior unchanged.
  • nessy demo: vblank-bounce (issue #196, epic #193): new roms/demos/vblank-bounce/ — third homemade demo. Single 8×8 tile bounces inside the playfield, position updated by the NMI handler each frame. PPUCTRL bit 7 enables NMI; vblank @ scanline 241 raises it; the NMI handler erases the current cell, advances (pos_x, pos_y) with edge-clamped bounce in [2, 29] × [2, 27], draws at the new cell, restores scroll, RTIs. Main loop is the canonical NES JMP self idle — exercises the CPU's variant-gated halt heuristic (NES skips the heuristic; KIL/STP still halt). Tests pin two framebuffer SHAs at 5 and 30 frames; assert they differ (catches a frozen NMI line / broken erase path). Inspect helper accepts CHIPPY_DEMO_FRAMES env to watch the tile walk frame-by-frame.
  • nessy demo: input-echo (issue #195, epic #193): new roms/demos/input-echo/ — second homemade demo. Eight indicator boxes in a controller layout (D-pad left, Select/Start/A/B right); each frame the program strobes $4016 and reads 8 bits, flipping each indicator tile between empty ($30) and full ($31). Exercises joypad serial-shift reads, per-frame VRAM writes inside vblank, and scroll-reset after $2006. Tests cover two pinned framebuffer SHAs — idle (all empty) and ButtonUp-held (Up indicator full). Asserts the two SHAs differ to catch a broken joypad path. Inspect helper TestDemo_InputEcho_Inspect with CHIPPY_DEMO_INSPECT=1 + optional CHIPPY_DEMO_BUTTON env renders a textual screenshot for any button combination.
  • DAP attach Phase B+C (issue #180, nessy v0.1): new internal/tui/source.go introduces a Source interface that owns step / reset / continue / pause / breakpoint / step-back control flow + an async event stream. LocalSource (default) is a thin wrapper around *cpu.CPU + *cpu.RAM; RemoteSource (source_remote.go) wraps a dap.Client and translates each operation into a DAP request. Mirror model: the TUI's m.CPU + m.RAM stay populated in both modes so every display panel keeps reading the same fields; RemoteSource writes the mirror after every operation via stackTrace → PC and variables(refRegisters) → A/X/Y/SP/P/Cycles. Model retrofit: Model.step() now delegates to m.Source.Step(); R key, r key (run/pause), and b key (breakpoint toggle) all gate on m.Source.Attached() and route through Continue / Pause / SetBreakpoints when remote. m.scheduleTick's run-loop skips local stepping in remote mode; the server's stopped event flips m.Running back off via a tea.Cmd that drains m.Source.Events(). cmd/chippy -dap-attach tcp:HOST:PORT now opens the full TUI against the remote: dial → initialize → attach → build mirror CPU + RAM → wrap with RemoteSource → tea.NewProgram. Phase A's smoke-test short-circuit is replaced. Closes #180. Tests cover Step/SetBreakpoints/Attached/Address against a real dap.Server over net.Pipe; TestRemoteSource_Step_SyncsMirrorFromServer verifies one wire step advances mirror PC + A to match server state.
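
The strobe-line model from the nes/joypad entry above can be sketched in a few lines of Go. This is a minimal illustration, not the code in internal/nes/joypad/: the Controller and Port shapes, the writeStrobe / read helpers, and the choice to take the snapshot while the strobe is high or on its falling edge are all assumptions made for the example.

```go
package main

import "fmt"

// Button indexes follow the shift-out order named above: A first, Right last.
type Button uint8

const (
	ButtonA Button = iota
	ButtonB
	ButtonSelect
	ButtonStart
	ButtonUp
	ButtonDown
	ButtonLeft
	ButtonRight
)

// Controller is a hypothetical stand-in for one pad, not chippy's actual type.
type Controller struct {
	live   uint8 // host-driven state; bit n is Button(n)
	shift  uint8 // latched snapshot being shifted out
	strobe bool  // last value written to $4016 bit 0
}

// Set is the host-side mutator that drives live state.
func (c *Controller) Set(b Button, pressed bool) {
	if pressed {
		c.live |= 1 << b
	} else {
		c.live &^= 1 << b
	}
}

// writeStrobe reloads the shift register while the strobe is high and once
// more on the falling edge, so the snapshot reflects the moment of release.
func (c *Controller) writeStrobe(v uint8) {
	high := v&1 != 0
	if high || c.strobe {
		c.shift = c.live
	}
	c.strobe = high
}

// read returns the live A bit while strobing; otherwise it shifts one bit
// out and feeds 1s in from the top, so a drained register reads as 1.
func (c *Controller) read() uint8 {
	if c.strobe {
		return c.live & 1
	}
	bit := c.shift & 1
	c.shift = c.shift>>1 | 0x80
	return bit
}

// Port pairs two controllers behind $4016-$4017.
type Port struct{ pads [2]Controller }

// Write strobes both pads on $4016; $4017 writes never touch the strobe.
func (p *Port) Write(addr uint16, v uint8) {
	if addr != 0x4016 {
		return
	}
	p.pads[0].writeStrobe(v)
	p.pads[1].writeStrobe(v)
}

// Read serves $4016 from pad 0 and $4017 from pad 1.
func (p *Port) Read(addr uint16) uint8 {
	return p.pads[addr&1].read()
}

func main() {
	var p Port
	p.pads[0].Set(ButtonA, true)
	p.pads[0].Set(ButtonStart, true)
	p.Write(0x4016, 1) // strobe high
	p.Write(0x4016, 0) // strobe low: snapshot taken
	for i := 0; i < 9; i++ {
		fmt.Print(p.Read(0x4016), " ")
	}
	fmt.Println()
}
```

Feeding 1s into the top of the register is one simple way to get the "reads return 1 after the eighth read" behaviour without a separate read counter.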

  • Source/disasm dual-mode scroll (issue #227): the [ / ] / { / } / ' keys now route to the source panel when ShowSource=true, otherwise to the disasm panel (existing behavior). New SourceFollow (default true) + SourceAnchorFile + SourceAnchorLine mirror the disasm pattern: first scroll pins the anchor from the current PC's mapped source line and flips follow off; subsequent scrolls move the anchor by ±1 / ±8 lines, clamped to [1, len(lines)]. ' restores follow mode. sourceView now picks centerFile/centerLine from anchor when pinned and still keeps the 👉 PC marker visible if PC's mapped file matches the centered file — so the user can scroll around without losing track of where execution is. Tests under internal/tui/source_scroll_test.go exercise the pin-on-first-scroll flip, anchor stepping, end-of-file clamps, pinned title hint, and follow restoration.
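
The pin-on-first-scroll behaviour described above can be sketched as a small state machine. This is an illustrative reduction, not the code in internal/tui: scrollState, scroll, restoreFollow and centerLine are hypothetical names, SourceAnchorFile is omitted, and the sketch assumes the first scroll both pins the anchor and applies its delta.

```go
package main

import "fmt"

// scrollState is a hypothetical reduction of SourceFollow + SourceAnchorLine.
type scrollState struct {
	Follow     bool
	AnchorLine int
}

// clamp limits v to the closed interval [lo, hi].
func clamp(v, lo, hi int) int {
	if v < lo {
		return lo
	}
	if v > hi {
		return hi
	}
	return v
}

// scroll pins the anchor from the PC's mapped line on first use, flips
// follow off, then moves the anchor by delta, clamped to [1, total].
func (s *scrollState) scroll(delta, pcLine, total int) {
	if total < 1 {
		total = 1
	}
	if s.Follow {
		s.AnchorLine = pcLine
		s.Follow = false
	}
	s.AnchorLine = clamp(s.AnchorLine+delta, 1, total)
}

// restoreFollow models the ' key: the view tracks the PC again.
func (s *scrollState) restoreFollow() { s.Follow = true }

// centerLine picks the line the view centers on.
func (s *scrollState) centerLine(pcLine int) int {
	if s.Follow {
		return pcLine
	}
	return s.AnchorLine
}

func main() {
	s := scrollState{Follow: true}
	s.scroll(-1, 10, 40) // first scroll pins at the PC line, then moves up one
	fmt.Println(s.Follow, s.centerLine(10))
	s.scroll(8, 12, 40) // later scrolls ignore the PC line
	fmt.Println(s.centerLine(12))
	s.restoreFollow()
	fmt.Println(s.centerLine(12))
}
```

Keeping the PC line as an argument rather than stored state mirrors the idea that the PC marker stays independent of the anchor, so the view can still highlight the PC when it falls inside the centered file.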

  • VHS smoke tests (issue #231): new test/smoke/ directory plus a root Makefile with make smoke / make smoke-all / make smoke-clean targets. Each .tape script drives chippy through a real TTY via charmbracelet/vhs, renders a .gif, and lets reviewers scrub the recording on PRs. Initial coverage: chippy-source-scroll (#227), chippy-syms (#226), chippy-bp-and-run (#225), nessy-attach (#220 / #224). A new smoke CI job on ubuntu-latest installs ttyd + ffmpeg, runs make -C test/smoke chippy, and uploads test/smoke/out/*.gif as workflow artifacts; the nessy tape is local-only because nessy needs X11 / GL dev headers the regular CI runners don't carry. CI gate is exit-code-only — a tape that crashes mid-render fails the job, but visual regressions still need human eyes on the artifact.
  • v0.3 demo ROM suite extension (issue #250, partial): three new homemade demos exercise the v0.3 audio path end-to-end. roms/demos/triangle-arpeggio/ cycles an A-major arpeggio (A4 / C#5 / E5) on the triangle channel, advancing one note every ~0.5 s from the NMI handler. roms/demos/noise-drum/ alternates the noise channel between low + high period indices every ~0.25 s, driving the LFSR feedback path. roms/demos/all-channels/ is a static kitchen-sink — pulse 1 + pulse 2 + triangle + noise all at fixed pitches — used to validate the non-linear DAC mixer (#249) under multi-channel load. Each demo ships pre-built .nes + .dbg; the Makefile's DEMOS list extends accordingly. New Pulse1LengthCounter() / Pulse2LengthCounter() / TriangleLengthCounter() / NoiseLengthCounter() accessors on apu.APU so headless cmd/nessy/demo_v03_audio_test.go can assert "channel still active" without touching internal fields. The all-channels test asserts every length counter survived the 30-frame run. DMC + MMC1 demos deferred — DMC needs embedded sample bytes (out of scope for this PR's size); MMC1 demo needs a multi-bank PRG layout that doesn't match the existing single-bank .cfg. File as follow-ups.
  • Non-linear DAC mixer (issue #249, nessy v0.3): replaces v0.2/v0.3-early linear pulse1+pulse2+(...) approximation with the nesdev DAC formulas. New internal/nes/apu/mixer.go exposes mixSample(p1, p2, tri, noi, dmc) float32 returning the combined output in [0, ~1.0]. Pulse term uses a 31-entry precomputed pulseTable (95.88 / ((8128/(p1+p2)) + 100)); tnd term evaluates the formula 159.79 / ((1 / (t/8227 + n/12241 + d/22638)) + 100) inline (3D LUT would be 32 KiB and the per-sample divide cost is trivial). emitSample now scales the float by 30000 to land in int16 with headroom — peak combined signal lands near 1.0 so no clipping. Tests under internal/nes/apu/mixer_test.go cover silent-zero, pulse-table monotonicity + ~0.258 peak, pulse vs triangle distinct levels at equal volume, combined > single, and no-int16-clipping at max input.
  • MMC1 mapper (issue #248, nessy v0.3): new internal/nes/cart/mmc1.go implements mapper 1, the first non-NROM mapper, which unlocks Zelda 1, Final Fantasy, Metroid, Castlevania II, Dragon Warrior, and most other early big-cart titles. Serial-shift-register write protocol on $8000-$FFFF: each write shifts bit 0 into a 5-bit accumulator; fifth write commits to one of four internal registers based on destination address bits 13-14 (control / chrBank0 / chrBank1 / prgBank). Bit-7 write resets the shift register + ORs control bits 2-3 (forces PRG mode 3). Four PRG modes (0 and 1 = 32 KiB switch, 2 = fixed-first, 3 = fixed-last) + two CHR modes (8 KiB switch / two 4 KiB switches). Optional 8 KiB PRG-RAM at $6000-$7FFF. Mirroring control via control bits 0-1: four modes including new MirrorSingleLower / MirrorSingleUpper added to internal/nes/ines.go (PPU bus.go now handles them by routing every logical nametable to physical bank 0 or 1). Cart factory cart.Open dispatches mapper=1 to NewMMC1. Powers up in PRG mode 3 per nesdev. Tests under internal/nes/cart/mmc1_test.go cover power-on state, 5-write serial commit, bit-7 reset, all four PRG modes, 4-KiB CHR mode, mirroring runtime change, PRG-RAM round-trip, and factory dispatch. nestest + perfgate + lint all green. Out of scope: MMC1 "consecutive cycle" bug (RMW double-write suppression), SOROM/SUROM sub-mappers, battery-backed PRG-RAM persistence.
  • APU DMC channel + DMA stall reuse (issue #246, nessy v0.3): new dmcChannel under internal/nes/apu/dmc.go plus dmcRateLUT in tables.go. Implements $4010-$4013: IRQ enable + loop + rate index ($4010), direct 7-bit output level ($4011), sample base address $C000 + (v*64) ($4012), sample length (v*16)+1 ($4013). Each timer expiry shifts one bit from the 8-bit shift register; bit=1 nudges level up by 2 (clamped 125), bit=0 down by 2 (clamped 2). Sample-buffer empty + bytes-remaining > 0 triggers DMA: charges cpu.Stall(4) (via new DMCStaller interface) + reads one byte from CPU bus (via new DMCBus interface). Sample address wraps $FFFF → $8000 per nesdev. On bytes-remaining-reaches-zero: if loop bit set, reload from $4012/$4013 base; else if IRQ-enable set, assert apu-dmc source on the IRQ sink. $4015 bit 4 enable (reload-if-zero pattern, doesn't restart mid-sample) + bit 4 read (bytes > 0) + bit 7 read (DMC IRQ pending). $4015 read clears DMC IRQ flag and drops the sink assertion (one read acks both frame + DMC IRQ here; note that nesdev documents the $4015 read as clearing only the frame IRQ flag). $4010 write with bit 7 = 0 also clears the DMC IRQ. $4011 direct writes survive even with channel disabled (the "audio thump" pattern). Mixer extended to pulse1 + pulse2 + triangle + noise + DMC; DMC scaled at 40/sample so peak (127*40 = 5080) sits in the same range as the other channels. cmd/nessy/wiring.go adds ap.SetDMCBus(mmio, processor) after CPU exists. Tests cover enable/load, direct-write override, base-address fetch + 4-cycle stall, loop reload, IRQ assert on exhaustion, $4015 read clears DMC IRQ, $4010 bit-7-clear clears DMC IRQ. nestest + perfgate + lint all green.
  • APU noise channel (issue #245, nessy v0.3): new noiseChannel under internal/nes/apu/noise.go. Implements $400C-$400F — envelope unit (same as pulse), length counter, 15-bit LFSR clocked at one of 16 NTSC periods (table at noisePeriodLUT). Two LFSR modes per $400E bit 7: mode 0 (long, 32767-step) uses bit0 XOR bit1 feedback; mode 1 (short, 93-step) uses bit0 XOR bit6 feedback. Output is the envelope volume gated by the LFSR's low bit. LFSR initializes to 1 (defensive guard reseeds 1 if a test pokes it to 0). Pulse + noise share the half-CPU-rate timer slot in stepCPU. Mixer extends to pulse1 + pulse2 + triangle + noise with the same 333/sample scale per non-pulse channel. $4015 bit 3 enable + read added; disable clears length. Tests under internal/nes/apu/noise_test.go cover enable/status, distinct-state count over 32 long-mode clocks, short-mode 93-step period detection, length-counter silencing, envelope decay, disabled silence, and mixer contribution. nestest + perfgate + lint all green.
  • APU triangle channel (issue #244, nessy v0.3): new triangleChannel under internal/nes/apu/triangle.go. Implements the standard NES $4008-$400B register file — linear counter (control + reload + value) at $4008, period low / high at $400A / $400B, length counter LUT-indexed via $400B bits 3-7. Linear counter clocked on quarter-frame ticks: reload flag set by $400B writes, cleared by next q-tick when control bit is clear; control=1 latches the flag so the counter never drains (loop mode). Length counter clocked on half-frame ticks, halted by the control bit. Period timer ticks every CPU cycle (not every other like pulse) so audible-range frequencies stay reachable despite the 32-step sequence. Sequencer freezes when length=0 OR linear=0 OR period<2 (silence quirk matches most emulators; real silicon emits inaudible buzz at <2). Mixer now sums pulse1 + pulse2 + triangle; triangle scaled 333/sample so peak triangle ≈ peak pulse pair (linear stand-in; non-linear DAC LUT is #249). $4015 bit 2 enable + read added; disable clears the length counter per nesdev. Tests under internal/nes/apu/triangle_test.go cover length load + status, linear-counter reload + decrement, sequencer advancement on timer tick, period<2 silence, disabled-channel silence, and mixer contribution. nestest + perfgate + lint all green.
  • CPU multi-source IRQ + APU frame-counter IRQ (issue #247, nessy v0.3): CPU gains AssertIRQSource(name) / ClearIRQSource(name) over an internal set of asserted sources; irqLine reflects a wired-OR of every active source. Existing single-source AssertIRQ() / ReleaseIRQ() remain as anonymous-source wrappers (source "") so the v0.1 / v0.2 tests don't regress. APU gains an IRQSink interface + SetIRQSink() + a frameIRQFlag field; 4-step mode fires the named IRQ source "apu-frame" at the end of each step-3 (when not inhibited). $4015 read returns the flag in bit 6 and clears both the flag and the source. Setting the $4017 inhibit bit clears any pending IRQ immediately. 5-step mode never fires the frame IRQ. cmd/nessy/wiring.go wires ap.SetIRQSink(processor) after both APU + CPU exist. Tests under internal/cpu/irq_sources_test.go cover OR semantics, idempotent assertion, anonymous + named source coexistence, and ghost-clear no-op; tests under internal/nes/apu/irq_test.go cover frame IRQ firing in 4-step, inhibit-clears-pending-and-prevents, 5-step skipping, $4015-read-clears, and headless-flag tracking. DMC IRQ wiring lands with the DMC channel itself (#246).
  • nessy Ebiten audio sink (closes #207 follow-up): new cmd/nessy/audio.go (nessy build tag) wires the APU's int16 sample ring into an Ebiten audio.Player. apuStream implements io.Reader, draining APU.Samples() under cpuMu, duplicating each mono sample across L/R, and padding short reads with silence so the audio thread never blocks on a slow CPU. audioSink owns the context + player; nil-safe so -mute (or a failed init on a headless host) cleanly skips playback without affecting CPU/PPU. game gains an audio *audioSink field; main constructs the sink after building the bus and calls sink.start() before ebiten.RunGame. The APU keeps emitting samples regardless of mute state — only the player goes away — so save-state APU coverage works consistently.
  • APU pulse 1 + pulse 2 + frame counter (issue #207, nessy v0.2): new internal/nes/apu/ package. APU is a cpu.Peripheral claiming $4000-$4013 (channel registers); StatusPeripheral wraps $4015 so the discontiguous APU surface dodges the OAMDMA $4014 window without forcing an MMIO refactor. Each pulseChannel (1 and 2) implements duty selector ($4000/4 bits 6-7), envelope unit (decay + constant + loop), sweep unit (period adjustment with the pulse-1 vs pulse-2 negate-mode difference), length counter (32-entry LUT), and the 11-bit period timer. The frame counter sits behind SetFrameCounter — joypad.Port now exposes AttachFrameCounter(FrameCounterSink) so $4017 writes (which fall inside joypad's existing $4016-$4017 range) forward to the APU instead of being dropped. 4-step (240 Hz, default) and 5-step (immediate quarter+half tick on write) modes both supported; IRQ assertion deferred until v0.3 (no peripheral IRQ pump wired yet). Sample emission uses a fractional cycle-per-sample accumulator pinned to 44.1 kHz so the int16 ring stays locked over long horizons; linear pulse1+pulse2 mixer (real-silicon's non-linear DAC LUT is a v0.x quality knob). cmd/nessy/wiring.go registers APU + StatusPeripheral alongside cart + joypad + PPU + OAMDMA (6 peripherals total now). Tests under internal/nes/apu/apu_test.go cover Range split, $4015 disable clearing length, 5-step immediate tick, zero-crossing count matching expected pulse frequency (FFT-equivalent assertion), length-counter silencing, envelope decay, channel-enable gating, and the period<8 sweep-mute path. nestest + perfgate + Klaus + lint all green. Out of scope: Ebiten audio sink wiring (a host-side shim in cmd/nessy lands in a follow-up so headless tests stay process-bound-free), triangle/noise/DMC channels (v0.3), nesdev DAC mixer LUT, IRQ assertion through MMIO, save-state APU-restore-with-audio.
  • Mid-frame scrolling (issue #206, nessy v0.2): PPU rendering goes per-scanline (was per-frame). New scrollSnapshot {scanline, scrollX, scrollY, baseNametable} captures $2000 (nametable bits) + $2005 + $2006 writes during visible scanlines via recordScrollChange(); stepDot snapshots frameStartScroll when scanline rolls back to 0 (= scroll state at the start of the next frame, set during the just-ended vblank). renderFrame walks scanlines 0..239, advancing through scrollEvents and rendering each row with the active snapshot's scroll. New renderScanline(y, snap) rasterizes one row: per-pixel apply scrollX/scrollY, wrap into adjacent nametables at the 256 / 240 boundaries (PPUCTRL bits 0/1 XOR), then standard nametable + attribute + pattern fetch. A tiny single-tile cache cuts per-pixel busReads from 4 down to 1 within a tile row. scrollFromV() derives scroll values from the 15-bit v latch (nesdev "loopy" layout) so $2006 mid-frame pairs (the SMB1 split mechanism) produce correct snapshots. Tests under internal/nes/ppu/scroll_test.go cover horizontal scroll shifting pixels, mid-frame split rendering top-32 rows with frame-start scroll + remainder with mid-frame scroll, horizontal nametable wrap, frame-start snapshot capture from stepDot, vblank-write filtering, and visible-write capture. v0.1 demos (hello-bg, input-echo, vblank-bounce) still render bit-identically (SHA-pinned tests unchanged). Out of scope: cycle-accurate per-dot v/t/x/w (fine-X bit slide, dot-257 horizontal copy, pre-render Y reload at dots 280-304) — per-scanline + per-tile fetch handles SMB1-class games, the per-dot work is a v0.x stretch.
  • Sprite rendering (issue #205, nessy v0.2): the PPU's per-frame renderFrame now seeds a bgOpaque [256*240]bool mask alongside the RGBA framebuffer, then a new renderSprites (in internal/nes/ppu/sprites.go) composites the sprite layer on top. OAM walk visits all 64 sprites in index order — lower OAM index wins on per-pixel collision (real-silicon priority). Each sprite reads attr bits 0-1 (palette select), bit 5 (priority behind BG), bit 6 (hflip), bit 7 (vflip); 8×16 mode (PPUCTRL bit 5) doubles the per-sprite row count and routes the pattern table via tile-index bit 0 instead of PPUCTRL bit 3. Sprite-0 hit ($2002 bit 6) latches whenever any opaque sprite-0 pixel coincides with an opaque BG pixel — gated by both BG show + sprite show. Sprite overflow ($2002 bit 5) sets when any visible scanline crosses more than 8 sprites; the v0.2 implementation is the simple correct version, not the silicon "bug". Compositor still triggers from the stepDot vblank-entry path so v0.1 BG-only demos (hello-bg, input-echo, vblank-bounce) render bit-identically (SHA-pinned tests unchanged). Tests under internal/nes/ppu/sprites_test.go cover single-sprite render, sprite-0 hit fires/doesn't-fire, overflow at 9/8 sprites/scanline, priority-behind-BG, 8×16 mode bottom half, lower-OAM-index priority winning, and sprite-show-disabled suppressing all sprite-side effects. nestest + perfgate + lint all green. Out of scope: cycle-accurate sprite-0 hit timing, sprite-overflow silicon bug, color emphasis interactions, $2007-during-rendering quirks.
  • nessy demo: sunsoft5b-chord (issue #325, nessy v0.7): new roms/demos/sunsoft5b-chord/ — Sunsoft FME-7 (mapper 69) cart + 5B audio half. Init sequence latches register addresses via $C000 + commits data via $E000 for: R7 mixer ($F8 = enable tones A/B/C, disable noise), R0-R5 tone periods (A=$80, B=$A0, C=$C0), R8-R10 amplitudes ($0C fixed level each). Validates the v0.6 Sunsoft 5B audio path (#306) end-to-end. Cart layout = 16 KiB PRG (2 × 8 KiB) for mapper 69. iNES header: flag6 high nibble = 5, flag7 high nibble = 4. Headless test asserts the APU has the 5B chip wired + non-zero samples in the ring.
  • nessy demo: vrc6-chord (issue #324, nessy v0.7): new roms/demos/vrc6-chord/ — programs all three VRC6 audio channels (2 pulse + 1 sawtooth) at distinct frequencies and spins. Headless test asserts the APU has the VRC6 audio chip wired + emits non-zero samples. Cart layout = 16 KiB total PRG (2 × 8 KiB) for mapper 24; the 16K switchable window at $8000-$BFFF maps bank 0 (unused), the fixed last 8 KiB at $E000-$FFFF carries code + vectors. iNES header gotcha: mapper-24 requires flag6 high nibble = 8 and flag7 high nibble = 1; getting flag7 wrong silently dispatches to mapper 8 instead.
  • nessy demo: state-counter (issue #327, nessy v0.7): new roms/demos/state-counter/ — dedicated save-state round-trip probe. NMI handler increments a zero-page counter (frame_cnt) + writes the byte to $3F00 (universal BG colour). Each frame the framebuffer fills with paletteRGB(frame_cnt & 0x3F) — colour is a 1:1 function of NMI count. Test TestDemo_StateCounter_SaveRoundTrip: boot, advance 30 frames, capture state; build fresh bus, advance 80 frames (intentionally divergent), apply state, assert post-restore zero-page frame_cnt + framebuffer both match the reference. Any save-state subsystem regression (CPU regs, zero-page RAM, PPU palette, NMI latch, frame counter) breaks the check. Complements the existing TestSaveState_RoundTrip_EndToEnd (which uses vblank-bounce) by exercising a more obviously state-dependent ROM.
  • nessy demo: oam-grid (issue #326, nessy v0.7): new roms/demos/oam-grid/ — exercises $4014 OAMDMA + the full 64-sprite OAM walk under a SHA-stable regression. Reset code seeds $0200-$02FF with 64 sprite records (8 × 8 grid of tile $30, centred at (88, 88)), writes $4014 = $02 to seed the PPU's OAM, enables sprite show, enables NMI. NMI handler does STA $4014 with A=$02 each frame so OAM keeps fresh data after every vblank — canonical NES pattern. CHR-ROM hand-rolls tile $30 as a solid 8×8 white square (plane 0 = $FF rows, plane 1 = $00 rows) so the rendered grid pixels are unmistakable. Test asserts OAM[0] (sprite-0 Y) + OAM[$FF] (sprite-63 X) + non-zero framebuffer pixels — deliberately avoids full SHA pinning because sprite-renderer tweaks legitimately shift per-pixel bytes without indicating a regression.
  • nessy demo: scroll-split (issue #328, nessy v0.7/v0.8): new roms/demos/scroll-split/ — mid-frame horizontal scroll split. Background is vertical 8px stripes (alternating blank/solid columns); the main loop sets scroll-X=0 at the top of the visible frame, busy-waits ~half a frame (cycle-counted nested dey/dex loop ≈ scanline 120), then rewrites scroll-X=8 via $2005 mid-render. The per-scanline renderer (#206) captures the mid-frame write through recordScrollChange, so the top half draws at scroll 0 + the bottom at scroll 8 — a visible split. Cycle-timed rather than sprite-0-hit, which is deterministic under nessy's fixed cycle model so the framebuffer is stable. Test asserts a top scanline's pixel row differs from a bottom one (split took effect) + two rows in the same region match (stripes vertically uniform within a scroll region).
  • DAP wire-transcript golden tests (issue #169, nessy v0.8): new internal/dap/transcript_test.go drives a real in-process dap.Server through committed request scenarios + diffs the decoded reply stream against a golden. Each scenario is a JSON array of client requests (testdata/dap-transcripts/<name>.json); the harness frames them into the Content-Length wire stream, runs Serve(), decodes the framed replies back into a flat message list, re-marshals each compactly (one per line), and compares to <name>.golden. -update regenerates goldens. Decoded-message comparison (not raw Content-Length bytes) keeps goldens diff-friendly + immune to header churn while still pinning every field the client sees — catches capability drift, response/event ordering, request_seq echo, sequencing off-by-ones that per-handler unit tests miss. Scenarios are path-free (no program launch) so goldens are portable across machines + CI. Shipped: handshake (initialize → threads → disconnect — the capabilities response is the headline drift net) + exception-breakpoints (initialize → setExceptionBreakpoints brk → clear → disconnect). Capture tooling (cmd/dap-record) from the issue is deferred — the in-process harness covers the regression need without a passthrough recorder.
  • VRC7 OPLL FM synth (issue #315, nessy v0.8): filled the silent v0.6 VRC7 audio stub with a real Yamaha YM2413 (OPLL) FM synth — Lagrange Point's soundtrack now plays. apu/vrc7.go's VRC7Audio keeps its type + $9010/$9030 register port (cart wiring unchanged) but now models 6 melodic channels, each a 2-operator FM voice (modulator → carrier) with a per-operator ADSR envelope, driven from the 15-entry fixed instrument patch ROM (the published OPLL table) + a user patch (regs $00-$07). Register decode: $10-$15 F-number low, $20-$25 F-number bit 8 + block + key-on + sustain, $30-$35 instrument + volume. FM math: phase + envelope advance once per emitted sample (Output called at SampleRate from emitSample), float sine FM with modulator self-feedback + modulation index from the modulator total level; output folds into the APU mix like the other expansions. Deliberately simplified vs cycle-exact OPLL: per-sample (not per-operator-slot) stepping, float sine instead of the log/exp LUT pipeline, simplified ADSR rate deltas, KSL + vibrato/tremolo depth omitted — enough for an audible, recognisable rendition. Tests: idle silence, key-on → FM output, key-off → decay to silence, patch ROM populated. The v0.6 cart shell + existing SetVRC7Audio wiring needed no change.
  • PPU fine-X bit slide (issue #282, nessy v0.8): scrollFromV was dropping the fine-X latch — p.scrollX = coarseX * 8 discarded the 3-bit sub-tile offset that lives in p.x (set by the first $2005 write). The $2005 scroll path already folded fine-X in (stores the full byte), but a $2006-derived scroll change (SMB1's mid-frame split mechanism) left fine-X behind, so sub-tile horizontal scroll snapped to 8-pixel boundaries. Fix: p.scrollX = coarseX*8 + p.x for fractional-scroll precision. Games that never touch $2005 keep p.x == 0, so every static demo renders byte-identically (SHAs unchanged). Tests: scrollFromV folds fineX + a render-level bit-slide proof (a pixel near a tile boundary that's black at scroll 0 slides into the next tile at fine-X scroll 3). The issue's "retire scrollSnapshot / full v-driven per-dot render" is a non-functional refactor deferred — the fine-X correctness it chased is now fixed via the snapshot path.
  • nesdev / Blargg accuracy test ROM suite (issue #318, nessy v0.8): new build-tagged (accuracy) cmd/nessy/accuracy_test.go runs real nesdev test ROMs through the full iNES → cart → MMIO → CPU + PPU + APU integration + checks the documented pass/fail signal. Blargg protocol: $6000 status ($80 running, $81 awaiting-reset, <$80 done; 0 = pass), $6001-$6003 magic $DE $B0 $61 gating when $6000 is trustworthy, $6004+ null-terminated result text. runBlargg steps frame-by-frame until status finishes (or a frame cap), reads status + text via cart.CPURead (MMC1's $6000 PRG-RAM). ROMs download + cache + SHA-pin on first run (mirrors nestest_test.go); *_BIN env overrides a local copy. New accuracy CI job. First ROM wired: ppu_vbl_nmi.nes — immediately surfaced a real gap (test 2 02-vbl_set_time fails: our PPU sets the vblank flag as a discrete scanline-241-dot-1 event, doesn't model the $2002-read-vs-set sub-dot race). Tracked as #342; the ROM carries a knownFail tag so the harness logs the gap + skips instead of hard-failing — CI stays green, gap stays visible. The job gates on "harness ran + read a status"; a regression in a passing ROM still fails it. First instalment; more ROMs add as validated.
  • PAL / Dendy timing variant (issue #320, nessy v0.8): per-region clock + frame geometry replacing the hard-coded NTSC constants. New internal/nes/timing.go defines a Timing struct (CPU clock Hz, FPS, cycles/frame, dots/scanline, scanlines/frame, vblank + pre-render scanline, odd-frame-skip flag, APU quarter-frame step) + NTSC / PAL / Dendy tables + TimingFor(TVSystem) (TVDual + unknown → NTSC). PPU + APU each hold a timing field defaulting to nes.NTSC so every existing demo + test renders byte-identically; SetRegion swaps it. PPU stepDot reads p.timing.{DotsPerScanline,ScanlinesPerFrame,VBlankScanline,PreRenderScanline,OddFrameSkip} (PAL = 312 lines / no dot-skip; NTSC + Dendy = 262). APU stepCPU + advanceFrameStep + sample accumulator read a.{cpuClockHz,quarterFrameCycles} (PAL clock 1.662607 MHz, quarter step ≈8313; Dendy 1.773448 MHz). buildNES derives region from rom.TVSystem + calls SetRegion; the game loop's budget reads bus.timing.CyclesPerFrame and ebiten.SetTPS(bus.timing.FPS) so PAL/Dendy run at 50 Hz wall-clock. Package-level NTSC consts stay as the values PPU/APU tests pin against. Tests: TimingFor mapping + geometry invariants + PPU PAL frame wraps at 312 scanlines + vblank fires at 241 under PAL + NTSC default unchanged. Out of scope: host display still refreshes at Ebiten TPS — internal game-state + audio cadence is region-correct (the audible part). First v0.8 item.
  • nessy headless GIF/MP4 recorder (issue #339, nessy v0.7): new cmd/nessy-record — captures a ROM run as a GIF or MP4 (video + audio + scripted input) with no Ebiten window, no OpenGL, no screen grab. Synthesizes the recording straight from the emulator: each frame's ppu.FrameBuffer() → video frame, each frame's drained apu.Samples() → audio track, a JSON input-script timeline ({"30":["A"],"33":[]} keyframe model) drives the joypad. Deterministic — same ROM + script → byte-identical output, so it's a stable CI artifact with no timing flake. GIF path is stdlib image/gif (NES ≤64 colours → palette built from frames, no quantisation; delay 2cs ≈ 50fps since 60fps isn't representable in centiseconds). MP4 path shells to ffmpeg: raw RGBA frames at 60fps + s16le mono PCM at the APU rate → H.264/AAC, 2× nearest-neighbor upscale. Bus wiring duplicated a third time (after cmd/nessy + cmd/nessy-wasm) — refactor into internal/nesbus is the standing cleanup. make -C test/smoke nessy-record renders vblank-bounce (visual GIF) + oam-grid (sprite GIF) + input-echo (scripted-input GIF) + all-channels (audio MP4); the CI smoke job runs it, uploads .gif/.mp4 artifacts, and embeds them in the sticky PR comment. Tests cover GIF validity + determinism (two runs byte-identical), MP4 (ffmpeg-gated skip), script parse + unknown-button rejection.
  • PPU sprite-overflow silicon bug (issue #283, nessy v0.7): replaced the simple-correct per-scanline sprite count with the 2C02's buggy evaluator. New evaluateSpriteOverflow(y, spriteH) walks OAM with the real silicon's n (sprite index) + m (byte index) pair: while fewer than 8 sprites are found, m stays 0 (Y byte read per sprite); once 8 are found, a NOT-in-range result increments BOTH n and m instead of resetting m — the floating m reads tile / attribute / X bytes as if they were Y, producing the hardware-specific false positives + false negatives Battletoads-class games rely on. Both render paths call it: compositeScanlineSprites (per-scanline, the live path) once per visible scanline, and the legacy per-frame renderSprites loops it across all scanlines so direct-call tests see identical flag behaviour. Sprite painting stays simple-correct (renders all in-range sprites) — only the overflow FLAG uses the buggy evaluator, matching how real silicon drives the flag off the evaluator while the compositor uses secondary OAM. Tests: a crafted 8-in-range + drifted-m false positive, an 8-in-range + off-screen-rest no-false-positive, and a 9-genuinely-in-range true positive. Existing 9-vs-8 overflow tests still pass.
  • nessy gamepad support (issue #321, nessy v0.7): new cmd/nessy/gamepad.go polls any standard-layout controller via Ebiten's IsStandardGamepadLayoutAvailable + IsStandardGamepadButtonPressed + StandardGamepadAxisValue. Button mapping per the v0.7 issue: D-pad → D-pad, RightBottom → NES A, RightRight → NES B, CenterRight → Start, CenterLeft → Select. Left analog stick ORs into D-pad past a 0.5 deadzone. First connected pad with a standard layout drives P1; others ignored. Per-frame connect / disconnect transitions log to stderr via the gamepadConnState.prev map so notifications fire once per change instead of every frame. Routing is additive — keyboard and gamepad inputs OR together so a player can mix both. Pads without a standard layout silently fall through (per-pad mapping path is a future issue).
  • UNROM bus-conflict variant (issue #319, nessy v0.7): real UNROM silicon ANDs the written bank-select byte with the ROM byte at the same address before committing. Most authored ROMs work around the quirk by writing through pages reading $FF, but a handful test the conflict explicitly. New busConfl bool on *UxROM set from iNES 2.0 sub-mapper 2; CPUWrite does v &= CPURead(addr) when set. UxROMState.BusConfl added for save-state persistence. Tests cover both branches with a synthesised ROM whose last bank reads $03 everywhere.
  • VRC7 cart (mapper 85) (issue #303, nessy v0.6 partial): Konami's mapper 85 — the cart that packages a VRC4-shape PRG/CHR/IRQ surface with a Yamaha YM2413 (OPLL) FM-synth audio expansion. Lagrange Point is the only commercial release that uses the VRC7's FM audio. v0.6 ships the cart side; OPLL FM synth split to v0.7 follow-up (#315). PRG: three switchable 8 KiB windows at $8000 / $A000 / $C000 plus a fixed last 8 KiB at $E000. CHR: eight switchable 1 KiB banks. Address-line routing uses address bit 4 ($X000 vs $X010) for sub-register select within each class; audio port pair is the exception at $9010 (register select) + $9030 (data write). $E000 controls mirroring (bits 0-1) + WRAM enable (bit 7); with bit 7 clear, CPU writes to $6000-$7FFF are silently dropped. IRQ counter identical-shape to VRC4/VRC6 with the vrc7 source name. Audio writes forward through cart.VRC7AudioSink; apu.VRC7Audio is the v0.6 stub that captures the 64-entry OPLL register file but returns Output() = 0. Cart-state union extends with VRC7State. cmd/nessy/wiring.go + cmd/nessy-wasm/main.go both detect VRC7 via SetAudioSink(cart.VRC7AudioSink) type-assertion and pair cart + stub. Tests cover PRG bank routing, mirroring + WRAM enable/disable, IRQ assert+ack, audio forwarding, mapper-85 dispatch, save/restore round-trip + the stub's register capture + silence guarantee. Lagrange Point now LOADS + plays gameplay with silent soundtrack.
  • nessy WASM playground (issue #210, nessy v0.6): new cmd/nessy-wasm/ — Ebiten js/wasm entrypoint that boots into an embedded default ROM and exposes a nessy.loadROM(Uint8Array) JS hook so visitors can drag-drop their own .nes files. Build via GOOS=js GOARCH=wasm go build -o web/nessy/nessy.wasm ./cmd/nessy-wasm. Page shell at web/nessy/index.html wires up the file picker + canvas mount + keyboard routing (Arrows / Z / X / Enter / RShift = D-pad / A / B / Start / Select). cartPeripheral + buildBus are intentionally duplicated inline in the wasm main rather than imported from cmd/nessy/wiring.go — the two package main modules can't share types without a non-trivial refactor; the duplication stays narrow + the surface is verified by the existing wasm-build CI job. Audio output deferred: Ebiten js/wasm needs a user-gesture unlock for the Web Audio context, and the existing audio decouple (cpuMu drain → push to oto Player) doesn't apply on the wasm path. apu.Samples() still drains the ring each frame so the buffer doesn't saturate. Pages workflow extends with the nessy build step + wasm_exec.js copy; published URL becomes chippy.dev/playground/nessy/. Out of scope here (v0.x stretch): localStorage save states, audio output, debug overlay.
  • VRC6 mapper + audio expansion (issue #302, nessy v0.6): Konami's mapper 24 / 26 (VRC6a / VRC6b — sub-bit pinout swap) ships a dedicated 3-channel audio expansion (2 pulse + 1 sawtooth) on top of a VRC4-style PRG/CHR/IRQ surface. Cart layout: 16 KiB switchable at $8000, 8 KiB switchable at $C000, fixed last 8 KiB at $E000; eight 1 KiB CHR banks via commands at $D000-$Exxx. Mirroring control via $B003 bits 2-3. IRQ counter is identical-shape to VRC4 (8-bit + CPU / scanline mode + $F000-$F002 for latch/control/ack). Audio register writes at $9000-$9002 / $A000-$A002 / $B000-$B002 forward through a cart.VRC6AudioSink to the new apu.VRC6Audio chip. Pulse channels offer 8 duty selectors from a 3-bit duty field (vs the 2A03's 4) + a "mode" bit forcing constant output at the volume level; sawtooth uses an 8-bit accumulator that adds a 6-bit rate twice per period across a 14-step cycle, with the high 5 bits of the accumulator emitted as the output. Cart-state union extends with VRC6State. cmd/nessy/wiring.go detects VRC6 via SetAudioSink(cart.VRC6AudioSink) type-assertion and pairs cart + chip + APU mixer. Tests cover PRG 16+8 bank routing, fixed-last tail, mirroring matrix, IRQ assert+ack, audio-write forwarding to logical addresses, VRC6b sub-bit swap, mapper-24/26 dispatch, save/restore round-trip + audio chip period commit + pulse emission + sawtooth accumulation.
  • Sunsoft 5B audio expansion (issue #306, nessy v0.6): the audio half of the FME-7 mapper package (#270 shipped the bank-switching side). YM2149 / PSG clone — three tone channels (12-bit period each), per-channel amplitude with envelope-follow bit, mixer, plus an envelope generator. Gimmick! is the headliner. Cart-side register interface forwards from FME-7's $C000 / $E000 port pair via new cart.Sunsoft5BSink (Write(addr, v byte)). apu.Sunsoft5B satisfies it directly. APU adds optional sunsoft5b *Sunsoft5B field + SetSunsoft5B(*) + folds the chip's Output() (0..45) into emitSample as a scaled int16 addend so non-FME7 carts pay zero cost. YM2149 prescaler approximated via a CPU-rate / 16 divider so tone frequencies land within ~1% of silicon. Out of scope for the minimum-viable v0.6: the noise LFSR (register R6 captured but the channel is silent — Gimmick! doesn't lean heavily on noise) and exotic envelope shapes (sawtooth shape 0xE + triangle shape 0xA are modelled; the rest hold). cmd/nessy/wiring.go detects FME-7 carts via SetAudioSink type-assertion and pairs the cart + chip + APU. Tests cover port routing, period commit, tone toggling, mixer-disable silence.
  • DMC / OAMDMA bus contention (issue #300, nessy v0.6): real silicon has a single DMA controller; when a DMC sample fetch fires inside an in-flight OAMDMA window, the DMC fetch waits for an OAMDMA byte slot and pays a 2-cycle alignment penalty. Detection via the new cpu.PendingStall() int accessor + extending the apu.DMCStaller interface with PendingStall(). In dmc.maybeRefill, the normal 4-cycle stall becomes 6 when the staller reports non-zero pending debt. Sample-perfect models per nesdev's dmc_dma_during_read need per-cycle accounting that chippy's instruction-boundary tick fan-out doesn't provide; this approximation is directionally correct + bounded — most shipping ROMs sit well outside the contention envelope. Test TestDMC_OAMDMAContentionAdds2Cycles baselines a non-contended run, then re-runs with pending=100 and asserts the per-fetch stall went 4→6.
  • nessy recent-ROMs CLI + controller config (issue #308, nessy v0.6): two daily-driver convenience hooks deferred from #273. cmd/nessy/recent.go keeps a 5-entry ~/.nessy/recent list (newest first, deduped). nessy with no args prints the numbered list to stderr + exits; nessy N (1..5) opens the Nth recent slot; a positional that looks like a path still opens the path verbatim. recordRecent writes the absolute path after each successful boot. cmd/nessy/controller.go loads ~/.nessy/controller.json (per-player p1 / p2 button → ebiten-key-name map), normalises identifiers (case + whitespace + dashes / underscores ignored), overlays the package-level keyMap before the game loop starts. Missing file is silent; malformed file warns + keeps defaults. Tests cover parseRecentSlot happy + reject paths, parseButton + normalize ergonomics. README documents both.
  • MMC3 RevA (NEC) IRQ variant (issue #301, nessy v0.6): MMC3 shipped in two silicon revisions. RevB (Sharp, default) is what v0.4 already implemented — explicit $C001 reload with latch=0 + IRQ enabled fires on the next A12 edge. RevA (NEC) silently loads the latch on explicit reload and skips the post-reload IRQ check; only the natural counter==0 → reload-and-fire path triggers. Klax wrote $C001 expecting NO IRQ; under RevB it would fire. New revA bool on *MMC3 set from iNES 2.0 sub-mapper 3; clockA12 dispatches to clockA12RevA when set. State + tests cover both branches: RevB fires on zero-latch reload, RevA stays silent, RevA still fires on natural countdown. MMC3State.RevA added for save-state persistence.
  • VRC2 / VRC4 mapper family (issue #269, nessy v0.5): new internal/nes/cart/vrc.go covers mappers 21, 22, 23, 25 behind a single shared core. Both chips share a register-class layout addressed at $8000 / $9000 / $A000 / $B000-$E000 / $F000; VRC4 adds a CPU-clock-counted IRQ at $F000-$FFFF that VRC2 leaves silent. The differentiator across the 6 chip variants Konami shipped is which CPU address bits encode the "sub-bits" (0-3) within each register class — per-mapper subBits() returns the sub-index from A0/A1, A1/A6, or A0/A1-swapped depending on the variant. iNES 2.0 sub-mapper hint (added in #271) picks between VRC2b/VRC4f on mapper 23 and VRC2c/VRC4b on mapper 25. PRG layout: switchable 8 KiB at $8000 + $A000, fixed second-to-last + last at $C000 + $E000; the VRC4 PRG mode bit at $9002 swaps $8000 with $C000. CHR layout: eight 1 KiB banks. VRC2a (mapper 22) shifts CHR bank values right by 1 because the chip only routes the upper 7 of the 8-bit bank index. VRC4 IRQ: 8-bit reload counter with two modes — scanline (341-cycle prescaler) and CPU (per-cycle); $F002 writes the control byte (enable + mode + enableAfterAck); $F003 acks pending and copies enableAfterAck → enable. Mirroring control via $9000 covers vertical / horizontal / single-screen-lower / single-screen-upper. Cart implements cpu.Ticker for the IRQ counter; cartPeripheral.Tick (added with FME-7) forwards. Cart-state union extends with VRCState for save / restore. Tests cover PRG bank routing + mode swap, mirroring, VRC4 CPU + scanline IRQ modes, VRC2 silent IRQ, mapper 22 CHR halving, all 4 mapper numbers dispatch through cart.Open, round-trip serialization. VRC6 audio expansion → #302; VRC7 mapper + OPLL FM synth → #303 (both v0.6).
  • PPU $2007 open-bus latch (issue #272, partial): real 2C02 silicon has a shared external PPU data bus; reads from write-only registers ($2000 / $2001 / $2003 / $2005 / $2006) return whatever byte last crossed the bus. PPUSTATUS reads OR live bits 7-5 with the latch's bits 4-0. Palette reads ($3F00-$3FFF) place only the 6-bit palette value on the bus; the upper 2 bits come from the latch. New openBus byte on *ppu.PPU: every Write updates it; reads from write-only addresses return it as-is; $2002 / $2004 / $2007 reads return their value AND refresh the latch with what just left the bus. Real silicon's DRAM-cell decay (~1 frame) isn't modelled — the latch holds indefinitely. internal/nes/ppu/open_bus_test.go covers the four shapes (write-only registers, status bits, palette upper-2-from-latch, OAM read latch update). State persistence: FullState.OpenBus field added. Third instalment of #272's cycle-accuracy hardening pass.
  • FME-7 / Sunsoft 5B mapper (issue #270, nessy v0.5): new internal/nes/cart/fme7.go implements mapper 69 — Sunsoft's late-NES mapper used by Gimmick! and Batman: Return of the Joker. PRG layout: four switchable 8 KiB windows at $6000 / $8000 / $A000 / $C000 plus a fixed last bank at $E000. CHR layout: eight switchable 1 KiB banks. Register interface is a command/parameter port pair — $8000 latches a 4-bit command, $A000 writes the parameter into the selected register (commands 0-7 = CHR banks, 8 = PRG-RAM enable + bank, 9-11 = PRG banks, 12 = mirroring, 13 = IRQ control, 14/15 = IRQ counter low/high). Mirroring control covers vertical / horizontal / single-screen-lower / single-screen-upper. IRQ: 16-bit down-counter ticks at CPU clock when command 13 bit 7 is set; on underflow, if bit 0 (IRQ enable) is set, asserts the named "fme7" source via the v0.3 multi-source pump. Writes to command 13 ack any pending IRQ. The cart implements cpu.Ticker so cartPeripheral in cmd/nessy/wiring.go forwards per-instruction cycle deltas via a type-assertion (also new — MMC3's A12 IRQ doesn't need this, but FME-7 + future VRC mappers do). Sunsoft 5B audio expansion (three YM2149-clone pulse channels) deferred to v0.6 — writes to the $C000 / $E000 audio port pair fall through as no-ops. Cart factory cart.Open dispatches mapper=69. Cart-state union extends with FME7State for save / restore. Tests: PRG bank routing, all four mirroring modes, IRQ underflow + ack semantics, counter-disable gating, CHR bank windowing, Open dispatch, save / restore round-trip.
  • nessy save states (issue #266, nessy v0.4): full snapshot + restore surface across the bus. Each subsystem ships its own FullState struct + SaveFullState() / LoadFullState() pair: internal/cpu (registers, IRQ sources, NMI latch, pending stall, variant — re-binds opcode table on restore), internal/cpu.RAM (full 64 KiB Data plus shadow-epoch reset), internal/nes/ppu (registers, latches, VRAM/OAM/palette, scanline/dot/frameCount, both framebuffers + bgOpaque mask, scroll-event history — uses displayMu so a Draw goroutine racing the load sees consistent video), internal/nes/apu (every channel's full register file + frame counter + sample accumulator — sample ring intentionally NOT persisted), internal/nes/joypad (controller shift + strobe state). Cart state is a discriminated union (cart.CartState) with per-mapper *State pointers; cart.SaveCart(c) / cart.LoadCart(c, s) dispatch by concrete type so every mapper (NROM / UxROM / CNROM / MMC1 / MMC3) ships full register + PRG-RAM + CHR-RAM coverage with mapper-kind-mismatch rejection. Composite envelope (cmd/nessy/savestate.go): nesSaveState{Magic, Version, ROMHash, CPU, RAM, PPU, APU, Joypad, Cart} gob-encoded then gzip-compressed (framebuffer-heavy state compresses 10×+). On disk: ~/.nessy/states/<rom-hash>-slot<N>.state. Hotkeys: F1–F4 save into slots 1–4; F5–F8 load. saveStateMgr queues save/load from the input poll; the game loop drains the queue under cpuMu right before stepping so capture / restore lands on an instruction boundary. ROM-hash tag prevents cross-game restore (a slot saved for Mario refuses to load into Zelda). Tests: per-package round-trips (CPU, RAM) + composite end-to-end (TestSaveState_RoundTrip_EndToEnd) boots vblank-bounce, saves at frame 60, advances a fresh bus 10 frames + restores + steps 30 more, asserts post-state PC + framebuffer match the reference exactly. TestSaveState_EncodeDecode_RoundTrip covers the gzip+gob path. Format version pinned at 1 — bump on any breaking schema change. 
v0.4 epic complete.
  • PPU odd-frame dot-skip (issue #272, partial): on NTSC with rendering enabled, the pre-render scanline (261) is one dot shorter on odd frames — the PPU jumps from dot 339 straight to (0,0) of the next frame, skipping dot 340. SMB1 and other long-running titles depend on this so the 240-line visible scroll stays in phase with the audio frame rate over the long horizon. stepDot now computes a dynamic boundary: dotsPerScanline - 1 when scanline == preRenderScanline && renderingEnabled() && frameCount&1 == 1, else dotsPerScanline. New internal/nes/ppu/odd_frame_test.go covers the three combinations (odd+rendering = skip, even+rendering = no skip, odd-no-rendering = no skip). Second instalment of #272's cycle-accuracy hardening pass.
  • OAMDMA odd-cycle alignment (issue #272, partial): real silicon charges 513 stall cycles for $4014 OAMDMA on an even-CPU-cycle write, 514 on an odd-cycle write (the bus-steal aligns on a read cycle, so an odd-cycle start eats one extra dummy cycle). The v0.2 implementation always charged 513. Now dma.OAMDMA.Write samples cpu.CPU.CurrentCycle() via the new CPUStaller.CurrentCycle() interface method, adds 1 to the stall when cycle & 1 == 1. New (*cpu.CPU).CurrentCycle() accessor returns c.Cycles for peripherals that need cycle-aligned timing (DMC contention will reuse it next). Tests under internal/nes/dma/oamdma_test.go cover even-vs-odd entry; the existing cmd/nessy/wiring_test.go assertions update from 513 → 514 since cpu.Reset() leaves Cycles=7 (odd) so the canonical post-reset DMA path now exercises the odd branch. First instalment of #272's cycle-accuracy hardening pass.
  • nessy demo: mmc1-banks (issue #261): new roms/demos/mmc1-banks/ — MMC1 PRG bank-switching demo. Two switchable 16 KiB banks; each holds a single palette-color byte at its first address ($0F = black for bank 0, $30 = white for bank 1). Fixed bank ($C000-$FFFF) hosts reset + NMI handlers. NMI reads $8000, writes the byte to $3F00 (universal BG color), and every 30 NMIs toggles prgBank via the standard MMC1 5-write serial protocol to $E000. Visible: BG flashes between two colours roughly twice per second. Critical iNES layout note: a 2 PRG bank cart is exactly 32 KiB of PRG bytes; both the fixed code and the vectors live inside the second physical bank — separate VECTORS MEMORY region in the linker cfg breaks the layout (PRG ends up 48 KiB and the loader truncates). Cfg uses BANK0 + BANK1 memory regions, with FIXED and VECTORS segments both loaded into BANK1. Test TestDemo_MMC1Banks_BankSwitchVisible boots the ROM twice, snapshots framebuffers at frames 5 and 45 (intentionally mid-toggle), asserts SHA inequality. Validates the v0.3 MMC1 mapper (#248) end-to-end. Closes the deferred v0.3 demo follow-up.
  • nessy demo: dmc-sample (issue #260): new roms/demos/dmc-sample/ — DMC channel end-to-end demo. Loops a 65-byte alternating-bit sample ($AA × 33 + $55 × 32) at rate index 0 (the slowest DMC rate: 428 CPU cycles per bit, ~4.18 kHz on NTSC). Sample lives at PRG $F000 via a dedicated SAMPLE linker segment; $4012=$C0 (= ($F000-$C000)/64) + $4013=4 (= (65-1)/16 → 65 bytes). $4010=$40 sets loop bit + clears IRQ enable + rate 0; $4011=$40 mid-level start; $4015=$10 enables. Validates the v0.3 DMC pipeline (#246) end-to-end: DMA stall on byte fetch, delta-PCM 7-bit level toggle, loop bit reload on exhaustion. New apu.DMCBytesRemaining() uint16 accessor for headless tests to assert "loop kept the channel fed". Test TestDemo_DMCSample_LoopsAndEmits runs 30 frames + asserts non-silent samples + bytes-remaining > 0. Closes the deferred v0.3 demo follow-up.
  • nessy player UX (issue #273, nessy v0.4): three quality-of-life keybinds in cmd/nessy/game.go + a screenshot helper alongside frame-dump. Tab held = 4× fast-forward (multiplier on the per-Update CPU cycle budget; audio still drains so playback fast-pitches but doesn't desync). F11 = ebiten.SetFullscreen toggle. F12 = one-shot framebuffer PNG to $HOME/.nessy/screenshots/<rom>-<YYYYMMDD-HHMMSS>.png. Hotkeys edge-triggered via inpututil.IsKeyJustPressed so a held key doesn't re-fire. Screenshot reuses framedump.go's image/png path but writes to a distinct directory so diagnostic dumps (~/.nessy/dumps/) don't clutter the user-facing shots.
  • CPU stall hook + $4014 OAMDMA (issue #204, nessy v0.2): new cpu.Stall(int) API plus a pendingStall field on CPU. Step() drains the full counter as one block — bus ticker fires once, Cycles advances, no opcode executes. NMI/IRQ service still preempts a queued stall so a peripheral that asserts an interrupt mid-DMA gets serviced first (matches the rest of the interrupt path). New internal/nes/dma/ package with OAMDMA — a cpu.Peripheral claiming $4014-$4014. On write of byte $XX, walks 256 bytes from CPU page $XX00-$XXFF through the bus into the PPU's OAM cursor, then charges 513 stall cycles. Wired into cmd/nessy/wiring.go alongside cart + joypad + PPU; the nesBus struct gains a dma field for introspection. New PPU accessor OAM(byte) byte for side-effect-free test assertion. Tests cover Stall's queue/drain/NMI-preempt semantics (internal/cpu/stall_test.go), the DMA peripheral via fakes + a real MMIO round-trip (internal/nes/dma/oamdma_test.go), and an end-to-end STA $4014 against the live NES bus that verifies OAM contents + post-DMA stall drain (cmd/nessy/wiring_test.go). nestest + perfgate + Klaus all still pass. Out of scope for v0.2: the 514-cycle odd-CPU-cycle alignment penalty, DMC sample-DMA contention timing, per-byte sub-cycle accounting.
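
The OAMDMA entries above (#204 for the base 513-cycle stall, #272 for the odd-cycle alignment) can be sketched as follows. This is a minimal illustration and not the project's code: cpuStaller, fakeCPU and oamDMA are hypothetical names, the method signatures are assumed, and the copy starts at OAM index 0 instead of the PPU's current OAM cursor.

```go
package main

import "fmt"

// cpuStaller is a hypothetical stand-in for the CPUStaller interface named
// above; the real method set and types may differ.
type cpuStaller interface {
	CurrentCycle() uint64
	Stall(n int)
}

// fakeCPU records the stall it is charged so the example can be inspected.
type fakeCPU struct {
	cycles  uint64
	stalled int
}

func (f *fakeCPU) CurrentCycle() uint64 { return f.cycles }
func (f *fakeCPU) Stall(n int)          { f.stalled += n }

// oamDMA copies CPU page $XX00-$XXFF into OAM, then charges the stall.
// Simplification: the copy starts at OAM index 0 rather than the PPU's
// current OAM cursor.
func oamDMA(cpu cpuStaller, mem *[65536]byte, oam *[256]byte, page byte) {
	base := int(page) << 8
	for i := 0; i < 256; i++ {
		oam[i] = mem[base+i]
	}
	stall := 513
	if cpu.CurrentCycle()&1 == 1 {
		stall++ // an odd-cycle write pays one extra alignment cycle
	}
	cpu.Stall(stall)
}

func main() {
	var mem [65536]byte
	var oam [256]byte
	mem[0x0200] = 0xAB

	even := &fakeCPU{cycles: 6}
	odd := &fakeCPU{cycles: 7} // e.g. the cycle count right after reset
	oamDMA(even, &mem, &oam, 0x02)
	oamDMA(odd, &mem, &oam, 0x02)
	fmt.Println(even.stalled, odd.stalled, oam[0])
}
```

Reading the cycle parity through an interface keeps the DMA peripheral testable with a fake, which is the shape the unit tests described above rely on.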

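The Blargg status protocol described in the #318 entry above ($6000 status, $6001-$6003 magic, $6004+ text) can be decoded roughly like this. It is a sketch under stated assumptions: blarggResult and decodeBlargg are hypothetical names, and the input slice is assumed to be PRG-RAM starting at CPU address $6000; the real runBlargg reads through cart.CPURead instead.

```go
package main

import (
	"bytes"
	"fmt"
)

// blarggResult is a hypothetical container for the decoded status block.
type blarggResult struct {
	Valid   bool   // magic $DE $B0 $61 present, so the status byte is trustworthy
	Running bool   // status >= $80: still running or awaiting reset
	Code    byte   // final result code once done; 0 means pass
	Text    string // null-terminated result text from $6004 onward
}

// decodeBlargg interprets PRG-RAM, where index 0 corresponds to CPU $6000.
func decodeBlargg(prgRAM []byte) blarggResult {
	if len(prgRAM) < 4 || prgRAM[1] != 0xDE || prgRAM[2] != 0xB0 || prgRAM[3] != 0x61 {
		return blarggResult{} // magic missing: status is not yet meaningful
	}
	r := blarggResult{Valid: true}
	status := prgRAM[0]
	if status >= 0x80 {
		r.Running = true
		return r
	}
	r.Code = status
	text := prgRAM[4:]
	if i := bytes.IndexByte(text, 0); i >= 0 {
		text = text[:i]
	}
	r.Text = string(text)
	return r
}

func main() {
	ram := []byte{0x00, 0xDE, 0xB0, 0x61, 'P', 'a', 's', 's', 'e', 'd', 0}
	fmt.Printf("%+v\n", decodeBlargg(ram))
}
```

A harness would call something like this once per frame and stop when Valid is true and Running is false, or when the frame cap is reached.
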
Open issues

  • v1.7.0 epic #458 + sub-issues #449–#457, #461–#463 (see §0 for the breakdown). Theme A COMPLETE: #449 (stack), #450 (flags), #451 (memory), #452 (disasm) all done — all five panels DAP-sourced (Registers was #394). See Merged PRs of note. #453 (data breakpoints) done; #461 render path done — all five panels + nav DAP-sourced, zero direct cpu/RAM in render/nav. #471 done — server-driven local run via Server.RunBudget; run/step/breakpoint/watchpoint enforcement is server-owned (see Merged PRs). Rich TUI rewind kept as the local engine exception; single-step/mem-edit stay direct (deferred, low value). #454 done — setVariable writes Globals scalars + array children (DAP parity complete). #455 + #475 done — TestHarte65C02BusTrace; CMOS per-cycle bus-exact for all 256 opcodes (empty skip list): the interleave is CMOS-aware (RMW dummy-read, indexed page-cross, JMP indirect, push/pull, WDC NOPs — #455) and BBR/BBS's dummy write-back + branch-target read are modeled (#475). See Merged PRs. #463 done — MMIO.Freeze/Unfreeze/Frozen/FrozenAddrs extends the freeze facility to peripheral/cart-mapped addresses (bus-level write-suppress). #456 DONE (+ #462 native, folded in) — the full WDC 65C816 core landed across 5 PRs: a from-scratch 16-bit execution engine (step816), all width-aware handlers + the new addressing modes (long/[dp]/stack-rel/MVN/MVP), 256/256 opcodes Tom Harte-validated in both emulation and native (254 via TestHarte65816, MVN/MVP via unit test), Bus24 + bank-0 bridge so -cpu 65816 runs in the TUI, and Disasm816 width-dependent disassembly. See Merged PRs + ADR 0010. #457 done — WASM playground drag-and-drop closed its last done-when item (the playground was already built + Pages-deployed). All v1.7.0 sub-issues now shipped — the release is ready to cut (needs the goreleaser tag + finalizing ADR 0010).
The former "deferred to v2.0" set (#461 TUI flip, #462 65816 native, #463 freeze) was pulled into v1.7.0 — the flip is internal/-only so the public Go API stays additive (minor bump valid). Future 65816 work (not blocking v1.7): bank-aware bus + bank-aware TUI panels (cross-bank programs), per-cycle bus trace for the 65816 (#495 DONE — chunks 1-4, ADR 0012) — TestHarte65816BusTrace iterates all 256 opcodes and is per-cycle bus-exact in both emulation and native except four in harteBusSkip816: WAI/STP (halt with a None-address cycle the recorder can't model) and MVN/MVP (whole-block-move debugger model vs the corpus's mid-block cap, same as the state harness). Chunk 1: register/flag/transfer/immediate (io816). Chunk 2: addressing-mode / ALU-memory — DP +1 (dpIO), dp-index/stack-rel adds at PC-1 (ioPC1), the abs/(dp)-indexed cross cycle at the un-fixed address (indexIO), (sr),Y's +Y pointer re-read. Chunk 3: RMW (mode-dependent modify cycle — emulation dummy-WRITE-old / native dummy-READ, 16-bit hi-then-lo write order), stack push/pull (io@PC ×1/×2), branches (io@PC-1 + page-cross), JSR/JSL/RTS/RTL/RTI/BRK/COP interleaved pushes + signature/vector reads. Validates the full Harte cycle string — addr + value + all 8 pin bits (VDA/VPA/VPB/RWB/E/M/X/MLB), stricter than the 6502/65C02 traces: each access tags its type (c.busPins via pinData/pinProg/pinNone/pinVector + MLB across RMW), E/M/X snapshotted at instruction start (SEP/REP/XCE show pre-change widths). See ADR 0012.
  • 22 (homebrew-core) — still blocked on ~30 stars (release hygiene, not code).


5. Key Decisions & Rationale

Architecture

  • Variant-based CPU dispatch via per-CPU table pointer — chosen over a runtime switch in every opcode so future variants (65816, etc.) only need a new table file. Tables share NMOS as a base and override.
  • CMOS table init via copy-then-override, relying on the go tool handing files to the compiler in lexical file-name order, which fixes the order the per-file init() functions run (opcodes.go < opcodes_cmos.go < opcodes_illegal.go). This is a load-bearing invariant — renaming files could break the init chain.
  • ZPR addressing handler self-fetches operand bytes and self-advances PC; resolve() returns (0, false) for ZPR. Simpler than encoding both zp byte and rel target through resolve.
  • Disassembler is variant-aware (PR #55, issue #42). Legacy Disasm / DisasmWithSyms still use the NMOS table for back-compat; DisasmCPU / DisasmCPUWithSyms route through c.opcodes so CMOS-only mnemonics (STZ, PHX, BRA, etc.) render correctly. TUI + trace switched to the CPU-aware path.
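
The table-pointer dispatch and the copy-then-override init described above can be sketched like this. It is an illustration only: the opcode struct, the mnemonic helper and the exact NewVariant/bindTable signatures are assumptions, and both tables are built in a single init() here, which sidesteps the file-ordering dependency the real code relies on.

```go
package main

import "fmt"

type Variant int

const (
	NMOS Variant = iota
	CMOS
)

// opcode is reduced to a mnemonic for the sketch; a real entry would also
// carry the handler, addressing mode and base cycle count.
type opcode struct{ Name string }

var (
	nmosTable [256]opcode
	cmosTable [256]opcode
)

func init() {
	for i := range nmosTable {
		nmosTable[i] = opcode{Name: "???"}
	}
	nmosTable[0xA9] = opcode{Name: "LDA"}
	nmosTable[0x80] = opcode{Name: "NOP"} // unofficial NOP on NMOS

	cmosTable = nmosTable                 // copy: arrays are values in Go
	cmosTable[0x80] = opcode{Name: "BRA"} // then override CMOS-only slots
	cmosTable[0x64] = opcode{Name: "STZ"}
}

// CPU holds a pointer to the active table, so Step never switches on variant.
type CPU struct{ opcodes *[256]opcode }

func NewVariant(v Variant) *CPU {
	c := &CPU{}
	c.bindTable(v)
	return c
}

func (c *CPU) bindTable(v Variant) {
	switch v {
	case CMOS:
		c.opcodes = &cmosTable
	default:
		c.opcodes = &nmosTable
	}
}

func (c *CPU) mnemonic(op byte) string { return c.opcodes[op].Name }

func main() {
	fmt.Println(NewVariant(NMOS).mnemonic(0x80), NewVariant(CMOS).mnemonic(0x80))
}
```

The variant check happens once, at bind time; adding another variant means adding one table and one case, which is the motivation given above.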

Bug fixes worth remembering

  • PR #31: branch() was mutating c.Cycles directly but Step() returned only in.Cycles. Result: taken branches undercounted return value by 1–2. Fix: extraCycles int field, reset each Step, folded into the return.
  • Test gotcha: r.Load(addr, prog) then later r.Write(addr, x) clobbers the opcode. Discovered while writing the JMP (ind) wrap test — fixed by placing program at $8200 and using $8000 only as wrap-target sentinel.
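
The PR #31 fix can be illustrated with a reduced model. This is a sketch, not the project's Step(): the cpu struct, step and branch below are simplified stand-ins that only model a two-byte relative branch with a base cost of 2 cycles.

```go
package main

import "fmt"

type cpu struct {
	PC          uint16
	Cycles      uint64
	extraCycles int // penalties accumulated by handlers during one step
}

// branch records its penalties in extraCycles instead of touching Cycles:
// +1 when taken, +1 more when the target is on a different page.
func (c *cpu) branch(taken bool, offset int8) {
	if !taken {
		return
	}
	target := uint16(int32(c.PC) + int32(offset))
	c.extraCycles++
	if target&0xFF00 != c.PC&0xFF00 {
		c.extraCycles++
	}
	c.PC = target
}

// step executes one branch instruction and returns its full cycle cost.
func (c *cpu) step(taken bool, offset int8) int {
	c.extraCycles = 0 // reset at the start of every step
	const base = 2
	c.PC += 2 // opcode + operand byte
	c.branch(taken, offset)
	total := base + c.extraCycles
	c.Cycles += uint64(total)
	return total // return value and Cycles now agree
}

func main() {
	c := &cpu{PC: 0x80F0}
	fmt.Println(c.step(true, 0x20), c.Cycles)
}
```

Folding the penalty into the return value keeps callers that budget on the return value (for example a per-frame cycle loop) consistent with the Cycles counter.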

Tooling

  • CI matrix: ubuntu + macos + windows × Go stable. Lint and Klaus jobs ubuntu-only. Coverage uploaded only from the ubuntu test job.
  • golangci-lint v2 syntax. errcheck excludes (*os.File).Close, bytes.Buffer / strings.Builder writers, and fmt.Fprint* family.
  • -covermode=atomic is required with -race. fail_ci_if_error: false on Codecov so transient upload failures don't break the build.
  • License: MIT. GPL test ROMs (Klaus 6502_65C02_functional_tests) are NOT vendored — downloaded on demand with sha256 verification (fa12bfc761e6f9057e4cc01a665a7b800ff01ae91f598af1e39a1201d01953fd).
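
The download-then-verify step for non-vendored test ROMs can be sketched with the standard library alone. verifySHA256 is a hypothetical helper name, and the example checks a stand-in payload instead of the Klaus binary; the real harness's download and caching code is not shown here.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// verifySHA256 returns an error unless data hashes to the pinned hex digest.
func verifySHA256(data []byte, wantHex string) error {
	sum := sha256.Sum256(data)
	got := hex.EncodeToString(sum[:])
	if got != wantHex {
		return fmt.Errorf("sha256 mismatch: got %s, want %s", got, wantHex)
	}
	return nil
}

func main() {
	// Stand-in payload; a real caller would pass the downloaded ROM bytes
	// and the digest pinned in the test source.
	const pinned = "ba7816bf8f01cfea414140de5dae2223b00361a396177a9cb410ff61f20015ad"
	if err := verifySHA256([]byte("abc"), pinned); err != nil {
		fmt.Println("rejecting download:", err)
		return
	}
	fmt.Println("digest ok")
}
```

Checking the digest before the file is used means a changed or corrupted upstream artifact fails loudly instead of producing confusing test results.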

UI

  • Sigils mirror nvim-DAP:
      • 🛑 plain breakpoint
      • 👉 PC
      • 🔶 conditional
      • 💩 rejected
      • 📜 logpoint
      • 👁 read watch
      • ✏ write watch
      • 🔁 R+W watch
  • Wide emoji (2 cells) in marker column → drop leading space to keep address column aligned.

6. Critical Context

Local commands

# Standard build+test
go build ./... && go test -race -count=1 ./...

# CMOS-only tests
go test -count=1 -run 'TestCMOS|TestNMOS|TestVariant' -v ./internal/cpu/...

# Coverage
go test -race -count=1 -coverprofile=coverage.out -covermode=atomic ./...

# Lint
golangci-lint run ./...
golangci-lint run --build-tags=klaus ./...

# Klaus functional test (build-tagged)
go test -tags=klaus -timeout 5m -run TestKlaus -v ./internal/cpu/...

# Build example
make -C example cmos_demo.bin
make -C example run-cmos_demo

CLI flags

chippy -rom <file> [-addr 0x8000] [-reset 0xADDR] [-cfg linker.cfg] [-dbg syms.dbg] [--cpu nmos|65c02]
  • -rom — program to load (.bin .prg .hex .o)
  • -addr — load address for raw .bin (default 0x8000)
  • -reset — reset vector override (0 = use file's vector or load addr)
  • -cfg — ld65 linker config; required for .o files
  • -dbg — cc65 .dbg symbol file (auto-detected as <rom>.dbg if omitted)
  • --cpu — nmos (default) | 6502 | 65c02 | cmos | cmos65c02
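
One way the --cpu values could map onto the two 8-bit variants is sketched below. The grouping (6502 as an alias of nmos; cmos and cmos65c02 as aliases of 65c02), the case-insensitive matching, and the names Variant and parseVariant are assumptions made for illustration, not a description of cmd/chippy/main.go.

```go
package main

import (
	"fmt"
	"strings"
)

type Variant int

const (
	NMOS Variant = iota
	CMOS
)

// parseVariant maps a --cpu value to a Variant. The alias grouping is an
// assumption based on the value list in the flag description.
func parseVariant(s string) (Variant, error) {
	switch strings.ToLower(strings.TrimSpace(s)) {
	case "", "nmos", "6502":
		return NMOS, nil
	case "65c02", "cmos", "cmos65c02":
		return CMOS, nil
	}
	return NMOS, fmt.Errorf("unknown --cpu value %q", s)
}

func main() {
	for _, arg := range []string{"nmos", "65C02", "z80"} {
		v, err := parseVariant(arg)
		fmt.Println(arg, v, err)
	}
}
```

Go's standard flag package accepts both -cpu and --cpu for the same flag, so a parser like this can sit behind either spelling.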

Toolchain locations (macOS)

  • ca65, ld65, cc65 at /opt/homebrew/bin/

References

  • 65C02 opcodes: http://www.6502.org/tutorials/65c02opcodes.html
  • 65C02 opcode matrix: http://www.oxyron.de/html/opcodesc02.html
  • NMOS vs CMOS differences: http://wilsonminesco.com/NMOS-CMOSdif/
  • Klaus 6502_65C02_functional_tests: https://github.com/Klaus2m5/6502_65C02_functional_tests

7. File Map (key files)

CPU core

  • internal/cpu/cpu.go — CPU struct, Variant enum, New() / NewVariant(), Reset(), bindTable(), interrupt API (AssertIRQ/ReleaseIRQ/TriggerNMI), service routines, flag helpers
  • internal/cpu/exec.go — Step(), interrupt boundary service, addressing-mode load/store helpers, all opcode handlers (LDA/STA/ADC/SBC/branches/etc.)
  • internal/cpu/addressing.go — AddrMode enum, resolve(); IZP/IAX/ZPR modes for CMOS; IND mode variant-branched
  • internal/cpu/opcodes.go — NMOS opcode table (199 LOC)
  • internal/cpu/opcodes_cmos.go — CMOS overrides (BRA, PHX/PHY/PLX/PLY, STZ, TRB, TSB, INA/DEA, BIT #imm, RMB/SMB/BBR/BBS, adcDecimalCMOS, sbcDecimalCMOS, cmosNOPs)
  • internal/cpu/opcodes_illegal.go — NMOS unofficial opcodes (320 LOC)
  • internal/cpu/disasm.go — disassembler; variant-aware via DisasmCPU / DisasmCPUWithSyms. Legacy NMOS-fixed Disasm still exported for callers without a CPU handy.
  • internal/cpu/memory.go — Bus interface + RAM impl

Tests

  • internal/cpu/cpu_test.go — base helpers, LDA/ADC/etc. regression tests
  • internal/cpu/cycles_test.go — 4 cycle-count regression tests (PR #31)
  • internal/cpu/cmos_test.go — 15 CMOS regression tests
  • internal/cpu/cmos_e2e_test.go — loads example/cmos_demo.bin, runs under CMOS, asserts state; self-skips when bin absent
  • internal/cpu/interrupts_test.go — 10 IRQ/NMI tests (PR #33)
  • internal/cpu/klaus_test.go — build-tagged Klaus harness (PR #30); pattern reusable for BCD/decimal suites

TUI

  • internal/tui/model.go — Bubble Tea model, run loop, panel layout, key bindings
  • internal/tui/wbus.go — WBus wraps cpu.Bus, captures hits for memory watchpoints, ring buffer
  • internal/tui/bp.go — breakpoints
  • internal/tui/cond.go — conditional breakpoint expressions
  • internal/tui/membp.go / internal/tui/membp_test.go — memory breakpoints
  • internal/tui/prompt.go — command prompt
  • internal/tui/state.go — persistence (~/.chippy/state-<rom>.json)

Other

  • cmd/chippy/main.go — CLI entry; flag parsing; bus wrap chain
  • internal/loader/ — .bin/.prg/.hex/.o loaders (ld65 invoked for .o)
  • internal/symbols/ — .dbg parser, symbol table, source map
  • example/Makefile — cmos_demo target uses --cpu 65c02
  • example/cmos_demo.s — .setcpu "65c02"; LDA/LDX/LDY/PHX/PHY/STZ/INC A/BRA/JMP self
  • .gitignore — ignores *.bin/*.o/*.dbg/*.prg/*.hex/*.lst/*.map
  • .github/workflows/ci.yml — 3-OS test matrix + lint + Codecov + klaus job (ubuntu-only)
  • .github/workflows/release.yml — goreleaser on tag push
  • .goreleaser.yml — multi-arch binaries + brew cask publish
  • nkane/homebrew-tap repo — Casks/chippy.rb auto-updated by goreleaser (migrated formula→cask in #413; goreleaser deprecated brews: for pre-built binaries)

8. Next Steps (immediate)

  1. Choose next from open issues: #17 (reverse step), #18 (stack panel), #19 (mem editor), #20 (prompt history). #22 (homebrew-core) is gated on ~30 stars.
  2. Deferred: CI job for the CMOS e2e test (self-skips because binary is gitignored).
  3. Possible: integrate Bruce Clark's BCD timing test or 6502_decimal_test as a klaus-style build-tagged suite — would also exercise the CMOS BCD path.
  4. User-side: mascot image generation (prompts in docs/mascot-prompts.md).

9. Gotchas

  • The nkane/homebrew-tap cask update flow (Casks/chippy.rb, published by goreleaser) requires the HOMEBREW_TAP_GITHUB_TOKEN secret to remain valid; rotate it if it expires.
  • The Klaus ROM URL or sha256 changing would silently break CI's klaus job; pin is in internal/cpu/klaus_test.go.
  • CMOS table init relies on file-lexicographic Go init() ordering. Renaming opcodes_cmos.go to come after opcodes_illegal.go would cause illegals to bleed into the CMOS table.
  • Step() returns total cycles including interrupt service. Callers wanting just-the-instruction count would need separate tracking.
  • WBus reads c.PC after c.PC++, so logged PC is one past the opcode for fetches. Tests assume this.
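The init-ordering gotcha above can be illustrated with a small simulation. It assumes the CMOS table is derived by copying the NMOS table at init time, which the notes imply but do not spell out; `registerNMOS`, `registerIllegal`, `deriveCMOS`, and `build` are hypothetical stand-ins for the per-file init() bodies, called explicitly here so that both orderings can be compared in a single file.

```go
package main

import "fmt"

// table maps an opcode byte to a mnemonic; "" means unassigned.
type table [256]string

// registerNMOS stands in for the documented-opcode init (opcodes.go).
func registerNMOS(t *table) { t[0xA9] = "LDA" }

// registerIllegal stands in for the unofficial-opcode init (opcodes_illegal.go).
func registerIllegal(t *table) { t[0xA7] = "LAX" }

// deriveCMOS copies whatever the base table holds right now, then applies
// CMOS overrides. Assumption: this mirrors what opcodes_cmos.go does.
func deriveCMOS(base *table) table {
	c := *base
	c[0x80] = "BRA"
	return c
}

// build runs the three steps in one of the two possible orders and
// returns the resulting CMOS table.
func build(cmosBeforeIllegal bool) table {
	var nmos table
	registerNMOS(&nmos)
	if cmosBeforeIllegal {
		c := deriveCMOS(&nmos) // snapshot taken before illegals exist
		registerIllegal(&nmos)
		return c
	}
	registerIllegal(&nmos)
	return deriveCMOS(&nmos) // snapshot inherits the illegal opcodes
}

func main() {
	good := build(true)
	bad := build(false)
	fmt.Printf("cmos-first: $A7=%q $80=%q\n", good[0xA7], good[0x80])
	fmt.Printf("cmos-last:  $A7=%q $80=%q\n", bad[0xA7], bad[0x80])
}
```

One way to remove the dependence on file names would be to call the registration steps explicitly from a single init(), at the cost of touching the table setup code.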