Skip to content

refactor(plan): collapse Needs typed maps to generic attrCode dispatch#1475

Merged
gfyrag merged 1 commit into
release/v3.0from
refactor/EN-1242-generic-needs
Jul 3, 2026
Merged

refactor(plan): collapse Needs typed maps to generic attrCode dispatch#1475
gfyrag merged 1 commit into
release/v3.0from
refactor/EN-1242-generic-needs

Conversation

@gfyrag

@gfyrag gfyrag commented Jul 2, 2026

Copy link
Copy Markdown
Contributor

Summary

Data-structure refactor of plan.Needs — no semantic change vs release/v3.0.

Collapses 13 typed key maps (Ledgers, Boundaries, Volumes, References, Metadata, Transactions, SinkConfigs, NumscriptVersions, NumscriptContents, PreparedQueries, LedgerMetadata, Indexes, IdempotencyKeys) into a single generic dispatch:

type AttributeSet struct { Keys map[string]struct{} }

type Needs struct {
    Attributes      map[byte]*AttributeSet // keyed by dal.SubAttrX
    IdempotencyKeys map[domain.IdempotencyKey]struct{}
}

Admission and the mirror worker use p.Add(dal.SubAttrX, key.Bytes()) instead of per-type map assignments. Resolver dispatch lives in a new attribute_resolvers.go file — one protoAttrResolver[T] per attribute cache holds the cache/loader/store triple; the builder iterates the registration table.

Why

Adding a new attribute cache used to require edits across five files: a field on Needs, a case in Merge(), a length in AttributeKeysCount(), a bit in coverage_bits, and a new K-generic instantiation of resolveAttributePreload. The refactor drops that to a one-line entry in buildAttrResolvers.

What's preserved (release/v3.0 semantics)

  • Emission map: CacheGuaranteedDeclare, CacheNeedsTouchTouch, CacheMiss + bloom/Pebble-absentDeclare, CacheMiss + Pebble-load-hitValue(v). Byte-for-byte compatible with pre-refactor ExecutionPlan wire format.
  • Cache primitives: AttributeCache.Get keeps its gen0 → gen1 fallback; AttributeCache.Del keeps its strict Gen0 contract.
  • FSM Preload switch (Declare / Touch / Value dispatch, touch_noop panic on MirrorTouch).

Bug-fix side effects

Closes a few pre-existing invariant #6 gaps surfaced by the audit — sites where the FSM apply path called AttributeCache.Del (requires Gen0) but admission never declared the target:

  • DeleteLedger cascade → Boundaries preload
  • DeleteMetadata / MirrorIngest.DeletedMetadata account-target → AccountMetadata preload
  • DeleteLedgerMetadata → LedgerMetadata preload
  • DeletePreparedQuery → PreparedQueries preload
  • DropIndex / RemoveMetadataFieldType cascade → Indexes preload
  • RemoveEventsSink cascade → SinkConfigs preload

These missing declarations would have silently no-oped the FSM read and desynced nodes on any concurrent-rotation path. All strict-Del sites now declare coverage explicitly.

Follow-up

Chained PR #1455 layers the unified coverage model (gen0-strict Get, MirrorTouch silent no-op, removal of intent-verification machinery) on top of this refactor.

Test plan

  • GOROOT= go build ./... — clean
  • GOROOT= just pre-commit — 0 issues
  • Unit tests (go test ./...) — all green
  • Full CI on this branch

@coderabbitai

coderabbitai Bot commented Jul 2, 2026

Copy link
Copy Markdown
Contributor

Important

Review skipped

Auto reviews are disabled on base/target branches other than the default branch.

🗂️ Base branches to auto review (1)
  • main

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 6340c4ea-cbae-4ee9-85b9-87636177c315

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Use the checkbox below for a quick retry:

  • 🔍 Trigger review
✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch refactor/EN-1242-generic-needs

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands.

@NumaryBot

NumaryBot commented Jul 2, 2026

Copy link
Copy Markdown
Contributor

✅ Approve — automated review

The refactor collapses typed Needs maps into a generic attrCode dispatch. The single reviewer found no discrete correctness issues, and all prior automated feedback was consistent in approving the change. The 45-line coverage gap (~83% patch coverage) was already acknowledged in the prior PR discussion and is not severe enough to block merging. No concrete, grounded findings were raised.

No findings.

@codecov

codecov Bot commented Jul 2, 2026

Copy link
Copy Markdown

Codecov Report

❌ Patch coverage is 92.53731% with 20 lines in your changes missing coverage. Please review.
✅ Project coverage is 73.83%. Comparing base (7a5e7c9) to head (5bb9f23).
⚠️ Report is 1 commits behind head on release/v3.0.

Files with missing lines Patch % Lines
internal/application/mirror/worker.go 60.00% 7 Missing and 1 partial ⚠️
internal/infra/plan/builder.go 75.75% 6 Missing and 2 partials ⚠️
internal/infra/plan/needs.go 89.47% 2 Missing and 2 partials ⚠️
Additional details and impacted files
@@               Coverage Diff                @@
##           release/v3.0    #1475      +/-   ##
================================================
+ Coverage         69.56%   73.83%   +4.27%     
================================================
  Files               389      390       +1     
  Lines             39820    39749      -71     
================================================
+ Hits              27700    29348    +1648     
+ Misses             9699     7855    -1844     
- Partials           2421     2546     +125     
Flag Coverage Δ
e2e 73.83% <92.53%> (+4.27%) ⬆️
scenario 73.83% <92.53%> (+4.27%) ⬆️
unit 73.83% <92.53%> (+4.27%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Harness.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

gfyrag pushed a commit that referenced this pull request Jul 2, 2026
…as promote primitive

Restores strict AttributeCache.Del (Gen0 must hold the key at delete time)
by shifting the invariant from "admission emits per-intent assertions" to
"Preload uniformly promotes gen1→gen0 for every declared key". The
propose→apply race window is bounded by the admission-side CacheUnreachable
guard (2+ predicted rotations rejected up front, PR #1458), and the FSM
coverage gate (invariant #9) binds the read horizon to admission's
declared preload set — those two guardrails make Declare + MirrorTouch a
complete substitute for the historical AssertPresent / ExpectAbsent /
Touch machinery.

Semantic changes:

- `AttributeCache.Get` is gen0-strict — no gen1 fallback. `MirrorTouch`
  promotes any gen1 entry into gen0 at Preload before handlers run.
- `AttributeCache.Del` returns an error (kv.KV signature updated) and
  refuses to fabricate a tombstone from Gen1. Preload-contract violations
  bubble up loudly per invariant #7 instead of silently desyncing.
- `writeCacheTombstone` writes only the current gen0 byte in 0xFF (the
  gen1 byte is left as harmless stale data, purged on the next rotation)
  — keeping cache mem/disk equal for the same applied index (invariant #1).
- `protoSnapshotSlot.MirrorTouch` is a silent no-op when neither gen has
  the key. Under the coverage gate that state is legitimate: a Declare
  entry may cover a key that is only in Pebble (Value intent seeds those)
  or genuinely absent (handler ErrNotFound path).
- Resolver: `CacheGuaranteed` and `CacheNeedsTouch` both emit `Declare`
  now; `CacheMiss + Pebble-load-hit` still emits `Value(v)`. The proto
  variants `Touch` / `AssertPresent` / `ExpectAbsent` stay in raft_cmd.proto
  for wire compatibility with pre-EN-1242 plans but the resolver never
  emits them and the FSM Preload switch treats every non-Value intent
  identically via `MirrorTouch`.

Removed machinery:

- `plan.ErrStalePreload` sentinel (moved to a leaf `planerr` package so
  admission and state can reference `ErrCacheHorizonExceeded` without a
  cycle — that's all `planerr` holds now).
- `CacheSnapshotter.HasGen0Entry` / `HasLiveEntry`.
- The `touch_noop` panic that fired when `MirrorTouch` couldn't find the
  key in either generation.

Docs: `docs/technical/architecture/core/plan-intent-verification.md`
rewritten for the unified model — problem statement, Path A / Path B
race traces, the two guardrails, and the delete-site checklist.

Layered on top of PR #1475 (Needs generic dispatch refactor).
@gfyrag gfyrag force-pushed the refactor/EN-1242-generic-needs branch from a122827 to 7226eb9 Compare July 2, 2026 10:44
gfyrag pushed a commit that referenced this pull request Jul 2, 2026
…as promote primitive

Restores strict AttributeCache.Del (Gen0 must hold the key at delete time)
by shifting the invariant from "admission emits per-intent assertions" to
"Preload uniformly promotes gen1→gen0 for every declared key". The
propose→apply race window is bounded by the admission-side CacheUnreachable
guard (2+ predicted rotations rejected up front, PR #1458), and the FSM
coverage gate (invariant #9) binds the read horizon to admission's
declared preload set — those two guardrails make Declare + MirrorTouch a
complete substitute for the historical AssertPresent / ExpectAbsent /
Touch machinery.

Semantic changes:

- `AttributeCache.Get` is gen0-strict — no gen1 fallback. `MirrorTouch`
  promotes any gen1 entry into gen0 at Preload before handlers run.
- `AttributeCache.Del` returns an error (kv.KV signature updated) and
  refuses to fabricate a tombstone from Gen1. Preload-contract violations
  bubble up loudly per invariant #7 instead of silently desyncing.
- `writeCacheTombstone` writes only the current gen0 byte in 0xFF (the
  gen1 byte is left as harmless stale data, purged on the next rotation)
  — keeping cache mem/disk equal for the same applied index (invariant #1).
- `protoSnapshotSlot.MirrorTouch` is a silent no-op when neither gen has
  the key. Under the coverage gate that state is legitimate: a Declare
  entry may cover a key that is only in Pebble (Value intent seeds those)
  or genuinely absent (handler ErrNotFound path).
- Resolver: `CacheGuaranteed` and `CacheNeedsTouch` both emit `Declare`
  now; `CacheMiss + Pebble-load-hit` still emits `Value(v)`. The proto
  variants `Touch` / `AssertPresent` / `ExpectAbsent` stay in raft_cmd.proto
  for wire compatibility with pre-EN-1242 plans but the resolver never
  emits them and the FSM Preload switch treats every non-Value intent
  identically via `MirrorTouch`.

Removed machinery:

- `plan.ErrStalePreload` sentinel (moved to a leaf `planerr` package so
  admission and state can reference `ErrCacheHorizonExceeded` without a
  cycle — that's all `planerr` holds now).
- `CacheSnapshotter.HasGen0Entry` / `HasLiveEntry`.
- The `touch_noop` panic that fired when `MirrorTouch` couldn't find the
  key in either generation.

Docs: `docs/technical/architecture/core/plan-intent-verification.md`
rewritten for the unified model — problem statement, Path A / Path B
race traces, the two guardrails, and the delete-site checklist.

Layered on top of PR #1475 (Needs generic dispatch refactor).
gfyrag pushed a commit that referenced this pull request Jul 2, 2026
…as promote primitive

Restores strict AttributeCache.Del (Gen0 must hold the key at delete time)
by shifting the invariant from "admission emits per-intent assertions" to
"Preload uniformly promotes gen1→gen0 for every declared key". The
propose→apply race window is bounded by the admission-side CacheUnreachable
guard (2+ predicted rotations rejected up front, PR #1458), and the FSM
coverage gate (invariant #9) binds the read horizon to admission's
declared preload set — those two guardrails make Declare + MirrorTouch a
complete substitute for the historical AssertPresent / ExpectAbsent /
Touch machinery.

Semantic changes:

- `AttributeCache.Get` is gen0-strict — no gen1 fallback. `MirrorTouch`
  promotes any gen1 entry into gen0 at Preload before handlers run.
- `AttributeCache.Del` returns an error (kv.KV signature updated) and
  refuses to fabricate a tombstone from Gen1. Preload-contract violations
  bubble up loudly per invariant #7 instead of silently desyncing.
- `writeCacheTombstone` writes only the current gen0 byte in 0xFF (the
  gen1 byte is left as harmless stale data, purged on the next rotation)
  — keeping cache mem/disk equal for the same applied index (invariant #1).
- `protoSnapshotSlot.MirrorTouch` is a silent no-op when neither gen has
  the key. Under the coverage gate that state is legitimate: a Declare
  entry may cover a key that is only in Pebble (Value intent seeds those)
  or genuinely absent (handler ErrNotFound path).
- Resolver: `CacheGuaranteed` and `CacheNeedsTouch` both emit `Declare`
  now; `CacheMiss + Pebble-load-hit` still emits `Value(v)`. The proto
  variants `Touch` / `AssertPresent` / `ExpectAbsent` stay in raft_cmd.proto
  for wire compatibility with pre-EN-1242 plans but the resolver never
  emits them and the FSM Preload switch treats every non-Value intent
  identically via `MirrorTouch`.

Removed machinery:

- `plan.ErrStalePreload` sentinel (moved to a leaf `planerr` package so
  admission and state can reference `ErrCacheHorizonExceeded` without a
  cycle — that's all `planerr` holds now).
- `CacheSnapshotter.HasGen0Entry` / `HasLiveEntry`.
- The `touch_noop` panic that fired when `MirrorTouch` couldn't find the
  key in either generation.

Docs: `docs/technical/architecture/core/plan-intent-verification.md`
rewritten for the unified model — problem statement, Path A / Path B
race traces, the two guardrails, and the delete-site checklist.

Layered on top of PR #1475 (Needs generic dispatch refactor).
gfyrag pushed a commit that referenced this pull request Jul 2, 2026
…as promote primitive

Fixes a strict AttributeCache.Del contract violation on Delete-like
orders when a concurrent Save + rotation runs between propose and apply:

  T0 Admission: CheckCache(k) → CacheMiss → Declare (no seed)
  T1 Concurrent Save(k, v)         → Gen0[k] = v
  T2 Cache rotation                → Gen1[k] = v, Gen0[k] = ∅
  T3 Delete(k) applies             → strict Del sees Gen0∅ → PANIC

The fix relies on two orthogonal guardrails already in force on
release/v3.0:

  1. Admission-side CacheUnreachable rejects any proposal predicting
     2+ rotations between propose and apply (plan.ErrCacheHorizonExceeded,
     gRPC Unavailable / HTTP 503). This bounds the propose→apply race
     to at most one rotation, so any key admission observed anywhere in
     cache is still reachable via Gen1 at Preload.
  2. The coverage gate (invariant #9) binds every FSM cache read to
     admission's declared preload set — a Declare entry can only be
     read by an order that emitted it.

Given those bounds, Preload-time MirrorTouch is a complete substitute
for per-intent verification: for every Declare plan entry, promote any
Gen1 hit into Gen0 before the handler runs. Applied to T3 above, Gen0
gets seeded with v before strict Del runs and the panic is gone.

Semantic changes on the release/v3.0 baseline:

- AttributeCache.Get is gen0-strict — no gen1 fallback. MirrorTouch has
  already promoted anything that exists anywhere in cache before the
  handler reads.
- AttributeCache.Del returns an error (kv.KV signature updated) and
  refuses to fabricate a tombstone from Gen1. Preload-contract
  violations bubble up loudly per invariant #7 instead of silently
  desyncing.
- writeCacheTombstone writes only the current gen0 byte in 0xFF (the
  gen1 byte is left as harmless stale data, purged on the next
  rotation) — keeping cache mem/disk equal for the same applied index
  (invariant #1).
- protoSnapshotSlot.MirrorTouch is a silent no-op when neither gen has
  the key (previous behavior: touch_noop panic). Under the coverage
  gate that state is legitimate.
- Resolver: CacheGuaranteed and CacheNeedsTouch both emit Declare now;
  CacheMiss + Pebble-load-hit still emits Value(v). The proto's Touch
  variant is retained for wire compat with pre-refactor plans still in
  flight during a rolling upgrade (FSM Preload treats it identically to
  Declare via MirrorTouch).

Docs: docs/technical/architecture/core/plan-intent-verification.md
walks the Path A race, the two guardrails, and the strict-Del site
checklist.

Layered on top of PR #1475 (Needs generic dispatch refactor).
gfyrag pushed a commit that referenced this pull request Jul 2, 2026
…as promote primitive

Fixes a strict AttributeCache.Del contract violation on Delete-like
orders when a concurrent Save + rotation runs between propose and apply:

  T0 Admission: CheckCache(k) → CacheMiss → Declare (no seed)
  T1 Concurrent Save(k, v)         → Gen0[k] = v
  T2 Cache rotation                → Gen1[k] = v, Gen0[k] = ∅
  T3 Delete(k) applies             → strict Del sees Gen0∅ → PANIC

The fix relies on two orthogonal guardrails already in force on
release/v3.0:

  1. Admission-side CacheUnreachable rejects any proposal predicting
     2+ rotations between propose and apply (plan.ErrCacheHorizonExceeded,
     gRPC Unavailable / HTTP 503). This bounds the propose→apply race
     to at most one rotation, so any key admission observed anywhere in
     cache is still reachable via Gen1 at Preload.
  2. The coverage gate (invariant #9) binds every FSM cache read to
     admission's declared preload set — a Declare entry can only be
     read by an order that emitted it.

Given those bounds, Preload-time MirrorTouch is a complete substitute
for per-intent verification: for every Declare plan entry, promote any
Gen1 hit into Gen0 before the handler runs. Applied to T3 above, Gen0
gets seeded with v before strict Del runs and the panic is gone.

Semantic changes on the release/v3.0 baseline:

- AttributeCache.Get is gen0-strict — no gen1 fallback. MirrorTouch has
  already promoted anything that exists anywhere in cache before the
  handler reads.
- AttributeCache.Del returns an error (kv.KV signature updated) and
  refuses to fabricate a tombstone from Gen1. Preload-contract
  violations bubble up loudly per invariant #7 instead of silently
  desyncing.
- writeCacheTombstone writes only the current gen0 byte in 0xFF (the
  gen1 byte is left as harmless stale data, purged on the next
  rotation) — keeping cache mem/disk equal for the same applied index
  (invariant #1).
- protoSnapshotSlot.MirrorTouch is a silent no-op when neither gen has
  the key (previous behavior: touch_noop panic). Under the coverage
  gate that state is legitimate.
- Resolver: CacheGuaranteed and CacheNeedsTouch both emit Declare now;
  CacheMiss + Pebble-load-hit still emits Value(v). The proto's Touch
  variant is retained for wire compat with pre-refactor plans still in
  flight during a rolling upgrade (FSM Preload treats it identically to
  Declare via MirrorTouch).

Docs: docs/technical/architecture/core/plan-intent-verification.md
walks the Path A race, the two guardrails, and the strict-Del site
checklist.

Layered on top of PR #1475 (Needs generic dispatch refactor).
gfyrag pushed a commit that referenced this pull request Jul 2, 2026
…as promote primitive

Fixes a strict AttributeCache.Del contract violation on Delete-like
orders when a concurrent Save + rotation runs between propose and apply:

  T0 Admission: CheckCache(k) → CacheMiss → Declare (no seed)
  T1 Concurrent Save(k, v)         → Gen0[k] = v
  T2 Cache rotation                → Gen1[k] = v, Gen0[k] = ∅
  T3 Delete(k) applies             → strict Del sees Gen0∅ → PANIC

The fix relies on two orthogonal guardrails already in force on
release/v3.0:

  1. Admission-side CacheUnreachable rejects any proposal predicting
     2+ rotations between propose and apply (plan.ErrCacheHorizonExceeded,
     gRPC Unavailable / HTTP 503). This bounds the propose→apply race
     to at most one rotation, so any key admission observed anywhere in
     cache is still reachable via Gen1 at Preload.
  2. The coverage gate (invariant #9) binds every FSM cache read to
     admission's declared preload set — a Declare entry can only be
     read by an order that emitted it.

Given those bounds, Preload-time MirrorTouch is a complete substitute
for per-intent verification: for every Declare plan entry, promote any
Gen1 hit into Gen0 before the handler runs. Applied to T3 above, Gen0
gets seeded with v before strict Del runs and the panic is gone.

Semantic changes on the release/v3.0 baseline:

- AttributeCache.Get is gen0-strict — no gen1 fallback. MirrorTouch has
  already promoted anything that exists anywhere in cache before the
  handler reads.
- AttributeCache.Del returns an error (kv.KV signature updated) and
  refuses to fabricate a tombstone from Gen1. Preload-contract
  violations bubble up loudly per invariant #7 instead of silently
  desyncing.
- writeCacheTombstone writes only the current gen0 byte in 0xFF (the
  gen1 byte is left as harmless stale data, purged on the next
  rotation) — keeping cache mem/disk equal for the same applied index
  (invariant #1).
- protoSnapshotSlot.MirrorTouch is a silent no-op when neither gen has
  the key (previous behavior: touch_noop panic). Under the coverage
  gate that state is legitimate.
- Resolver: CacheGuaranteed and CacheNeedsTouch both emit Declare now;
  CacheMiss + Pebble-load-hit still emits Value(v). The proto's Touch
  variant is retained for wire compat with pre-refactor plans still in
  flight during a rolling upgrade (FSM Preload treats it identically to
  Declare via MirrorTouch).

Docs: docs/technical/architecture/core/plan-intent-verification.md
walks the Path A race, the two guardrails, and the strict-Del site
checklist.

Layered on top of PR #1475 (Needs generic dispatch refactor).
gfyrag pushed a commit that referenced this pull request Jul 2, 2026
…as promote primitive

Fixes a strict AttributeCache.Del contract violation on Delete-like
orders when a concurrent Save + rotation runs between propose and apply:

  T0 Admission: CheckCache(k) → CacheMiss → Declare (no seed)
  T1 Concurrent Save(k, v)         → Gen0[k] = v
  T2 Cache rotation                → Gen1[k] = v, Gen0[k] = ∅
  T3 Delete(k) applies             → strict Del sees Gen0∅ → PANIC

The fix relies on two orthogonal guardrails already in force on
release/v3.0:

  1. Admission-side CacheUnreachable rejects any proposal predicting
     2+ rotations between propose and apply (plan.ErrCacheHorizonExceeded,
     gRPC Unavailable / HTTP 503). This bounds the propose→apply race
     to at most one rotation, so any key admission observed anywhere in
     cache is still reachable via Gen1 at Preload.
  2. The coverage gate (invariant #9) binds every FSM cache read to
     admission's declared preload set — a Declare entry can only be
     read by an order that emitted it.

Given those bounds, Preload-time MirrorTouch is a complete substitute
for per-intent verification: for every Declare plan entry, promote any
Gen1 hit into Gen0 before the handler runs. Applied to T3 above, Gen0
gets seeded with v before strict Del runs and the panic is gone.

Semantic changes on the release/v3.0 baseline:

- AttributeCache.Get is gen0-strict — no gen1 fallback. MirrorTouch has
  already promoted anything that exists anywhere in cache before the
  handler reads.
- AttributeCache.Del returns an error (kv.KV signature updated) and
  refuses to fabricate a tombstone from Gen1. Preload-contract
  violations bubble up loudly per invariant #7 instead of silently
  desyncing.
- writeCacheTombstone writes only the current gen0 byte in 0xFF (the
  gen1 byte is left as harmless stale data, purged on the next
  rotation) — keeping cache mem/disk equal for the same applied index
  (invariant #1).
- protoSnapshotSlot.MirrorTouch is a silent no-op when neither gen has
  the key (previous behavior: touch_noop panic). Under the coverage
  gate that state is legitimate.
- Resolver: CacheGuaranteed and CacheNeedsTouch both emit Declare now;
  CacheMiss + Pebble-load-hit still emits Value(v). The proto's Touch
  variant is retained for wire compat with pre-refactor plans still in
  flight during a rolling upgrade (FSM Preload treats it identically to
  Declare via MirrorTouch).

Docs: docs/technical/architecture/core/plan-intent-verification.md
walks the Path A race, the two guardrails, and the strict-Del site
checklist.

Layered on top of PR #1475 (Needs generic dispatch refactor).
gfyrag pushed a commit that referenced this pull request Jul 2, 2026
…as promote primitive

Fixes a strict AttributeCache.Del contract violation on Delete-like
orders when a concurrent Save + rotation runs between propose and apply:

  T0 Admission: CheckCache(k) → CacheMiss → Declare (no seed)
  T1 Concurrent Save(k, v)         → Gen0[k] = v
  T2 Cache rotation                → Gen1[k] = v, Gen0[k] = ∅
  T3 Delete(k) applies             → strict Del sees Gen0∅ → PANIC

The fix relies on two orthogonal guardrails already in force on
release/v3.0:

  1. Admission-side CacheUnreachable rejects any proposal predicting
     2+ rotations between propose and apply (plan.ErrCacheHorizonExceeded,
     gRPC Unavailable / HTTP 503). This bounds the propose→apply race
     to at most one rotation, so any key admission observed anywhere in
     cache is still reachable via Gen1 at Preload.
  2. The coverage gate (invariant #9) binds every FSM cache read to
     admission's declared preload set — a Declare entry can only be
     read by an order that emitted it.

Given those bounds, Preload-time MirrorTouch is a complete substitute
for per-intent verification: for every Declare plan entry, promote any
Gen1 hit into Gen0 before the handler runs. Applied to T3 above, Gen0
gets seeded with v before strict Del runs and the panic is gone.

Semantic changes on the release/v3.0 baseline:

- AttributeCache.Get is gen0-strict — no gen1 fallback. MirrorTouch has
  already promoted anything that exists anywhere in cache before the
  handler reads.
- AttributeCache.Del returns an error (kv.KV signature updated) and
  refuses to fabricate a tombstone from Gen1. Preload-contract
  violations bubble up loudly per invariant #7 instead of silently
  desyncing.
- writeCacheTombstone writes only the current gen0 byte in 0xFF (the
  gen1 byte is left as harmless stale data, purged on the next
  rotation) — keeping cache mem/disk equal for the same applied index
  (invariant #1).
- protoSnapshotSlot.MirrorTouch is a silent no-op when neither gen has
  the key (previous behavior: touch_noop panic). Under the coverage
  gate that state is legitimate.
- Resolver: CacheGuaranteed and CacheNeedsTouch both emit Declare now;
  CacheMiss + Pebble-load-hit still emits Value(v). The proto's Touch
  variant is retained for wire compat with pre-refactor plans still in
  flight during a rolling upgrade (FSM Preload treats it identically to
  Declare via MirrorTouch).

Docs: docs/technical/architecture/core/plan-intent-verification.md
walks the Path A race, the two guardrails, and the strict-Del site
checklist.

Layered on top of PR #1475 (Needs generic dispatch refactor).
gfyrag pushed a commit that referenced this pull request Jul 2, 2026
…as promote primitive

Fixes a strict AttributeCache.Del contract violation on Delete-like
orders when a concurrent Save + rotation runs between propose and apply:

  T0 Admission: CheckCache(k) → CacheMiss → Declare (no seed)
  T1 Concurrent Save(k, v)         → Gen0[k] = v
  T2 Cache rotation                → Gen1[k] = v, Gen0[k] = ∅
  T3 Delete(k) applies             → strict Del sees Gen0∅ → PANIC

The fix relies on two orthogonal guardrails already in force on
release/v3.0:

  1. Admission-side CacheUnreachable rejects any proposal predicting
     2+ rotations between propose and apply (plan.ErrCacheHorizonExceeded,
     gRPC Unavailable / HTTP 503). This bounds the propose→apply race
     to at most one rotation, so any key admission observed anywhere in
     cache is still reachable via Gen1 at Preload.
  2. The coverage gate (invariant #9) binds every FSM cache read to
     admission's declared preload set — a Declare entry can only be
     read by an order that emitted it.

Given those bounds, Preload-time MirrorTouch is a complete substitute
for per-intent verification: for every Declare plan entry, promote any
Gen1 hit into Gen0 before the handler runs. Applied to T3 above, Gen0
gets seeded with v before strict Del runs and the panic is gone.

Semantic changes on the release/v3.0 baseline:

- AttributeCache.Get is gen0-strict — no gen1 fallback. MirrorTouch has
  already promoted anything that exists anywhere in cache before the
  handler reads.
- AttributeCache.Del returns an error (kv.KV signature updated) and
  refuses to fabricate a tombstone from Gen1. Preload-contract
  violations bubble up loudly per invariant #7 instead of silently
  desyncing.
- writeCacheTombstone writes only the current gen0 byte in 0xFF (the
  gen1 byte is left as harmless stale data, purged on the next
  rotation) — keeping cache mem/disk equal for the same applied index
  (invariant #1).
- protoSnapshotSlot.MirrorTouch is a silent no-op when neither gen has
  the key (previous behavior: touch_noop panic). Under the coverage
  gate that state is legitimate.
- Resolver: CacheGuaranteed and CacheNeedsTouch both emit Declare now;
  CacheMiss + Pebble-load-hit still emits Value(v). The proto's Touch
  variant is retained for wire compat with pre-refactor plans still in
  flight during a rolling upgrade (FSM Preload treats it identically to
  Declare via MirrorTouch).

Docs: docs/technical/architecture/core/plan-intent-verification.md
walks the Path A race, the two guardrails, and the strict-Del site
checklist.

Layered on top of PR #1475 (Needs generic dispatch refactor).
gfyrag pushed a commit that referenced this pull request Jul 2, 2026
…as promote primitive

Fixes a strict AttributeCache.Del contract violation on Delete-like
orders when a concurrent Save + rotation runs between propose and apply:

  T0 Admission: CheckCache(k) → CacheMiss → Declare (no seed)
  T1 Concurrent Save(k, v)         → Gen0[k] = v
  T2 Cache rotation                → Gen1[k] = v, Gen0[k] = ∅
  T3 Delete(k) applies             → strict Del sees Gen0∅ → PANIC

The fix relies on two orthogonal guardrails already in force on
release/v3.0:

  1. Admission-side CacheUnreachable rejects any proposal predicting
     2+ rotations between propose and apply (plan.ErrCacheHorizonExceeded,
     gRPC Unavailable / HTTP 503). This bounds the propose→apply race
     to at most one rotation, so any key admission observed anywhere in
     cache is still reachable via Gen1 at Preload.
  2. The coverage gate (invariant #9) binds every FSM cache read to
     admission's declared preload set — a Declare entry can only be
     read by an order that emitted it.

Given those bounds, Preload-time MirrorTouch is a complete substitute
for per-intent verification: for every Declare plan entry, promote any
Gen1 hit into Gen0 before the handler runs. Applied to T3 above, Gen0
gets seeded with v before strict Del runs and the panic is gone.

Semantic changes on the release/v3.0 baseline:

- AttributeCache.Get is gen0-strict — no gen1 fallback. MirrorTouch has
  already promoted anything that exists anywhere in cache before the
  handler reads.
- AttributeCache.Del returns an error (kv.KV signature updated) and
  refuses to fabricate a tombstone from Gen1. Preload-contract
  violations bubble up loudly per invariant #7 instead of silently
  desyncing.
- writeCacheTombstone writes only the current gen0 byte in 0xFF (the
  gen1 byte is left as harmless stale data, purged on the next
  rotation) — keeping cache mem/disk equal for the same applied index
  (invariant #1).
- protoSnapshotSlot.MirrorTouch is a silent no-op when neither gen has
  the key (previous behavior: touch_noop panic). Under the coverage
  gate that state is legitimate.
- Resolver: CacheGuaranteed and CacheNeedsTouch both emit Declare now;
  CacheMiss + Pebble-load-hit still emits Value(v). The proto's Touch
  variant is retained for wire compat with pre-refactor plans still in
  flight during a rolling upgrade (FSM Preload treats it identically to
  Declare via MirrorTouch).

Docs: docs/technical/architecture/core/plan-intent-verification.md
walks the Path A race, the two guardrails, and the strict-Del site
checklist.

Layered on top of PR #1475 (Needs generic dispatch refactor).
gfyrag pushed a commit that referenced this pull request Jul 2, 2026
…as promote primitive

Fixes a strict AttributeCache.Del contract violation on Delete-like
orders when a concurrent Save + rotation runs between propose and apply:

  T0 Admission: CheckCache(k) → CacheMiss → Declare (no seed)
  T1 Concurrent Save(k, v)         → Gen0[k] = v
  T2 Cache rotation                → Gen1[k] = v, Gen0[k] = ∅
  T3 Delete(k) applies             → strict Del sees Gen0∅ → PANIC

The fix relies on two orthogonal guardrails already in force on
release/v3.0:

  1. Admission-side CacheUnreachable rejects any proposal predicting
     2+ rotations between propose and apply (plan.ErrCacheHorizonExceeded,
     gRPC Unavailable / HTTP 503). This bounds the propose→apply race
     to at most one rotation, so any key admission observed anywhere in
     cache is still reachable via Gen1 at Preload.
  2. The coverage gate (invariant #9) binds every FSM cache read to
     admission's declared preload set — a Declare entry can only be
     read by an order that emitted it.

Given those bounds, Preload-time MirrorTouch is a complete substitute
for per-intent verification: for every Declare plan entry, promote any
Gen1 hit into Gen0 before the handler runs. Applied to T3 above, Gen0
gets seeded with v before strict Del runs and the panic is gone.

Semantic changes on the release/v3.0 baseline:

- AttributeCache.Get is gen0-strict — no gen1 fallback. MirrorTouch has
  already promoted anything that exists anywhere in cache before the
  handler reads.
- AttributeCache.Del returns an error (kv.KV signature updated) and
  refuses to fabricate a tombstone from Gen1. Preload-contract
  violations bubble up loudly per invariant #7 instead of silently
  desyncing.
- writeCacheTombstone writes only the current gen0 byte in 0xFF (the
  gen1 byte is left as harmless stale data, purged on the next
  rotation) — keeping cache mem/disk equal for the same applied index
  (invariant #1).
- protoSnapshotSlot.MirrorTouch is a silent no-op when neither gen has
  the key (previous behavior: touch_noop panic). Under the coverage
  gate that state is legitimate.
- Resolver: CacheGuaranteed and CacheNeedsTouch both emit Declare now;
  CacheMiss + Pebble-load-hit still emits Value(v). The proto's Touch
  variant is retained for wire compat with pre-refactor plans still in
  flight during a rolling upgrade (FSM Preload treats it identically to
  Declare via MirrorTouch).

Docs: docs/technical/architecture/core/plan-intent-verification.md
walks the Path A race, the two guardrails, and the strict-Del site
checklist.

Layered on top of PR #1475 (Needs generic dispatch refactor).
Data-structure refactor of preload needs — no semantic change.

Before: `plan.Needs` held 13 typed key maps (`Ledgers`, `Boundaries`,
`Volumes`, `References`, `Metadata`, `Transactions`, `SinkConfigs`,
`NumscriptVersions`, `NumscriptContents`, `PreparedQueries`,
`LedgerMetadata`, `Indexes`, `IdempotencyKeys`). Adding an attribute
cache meant adding a field on Needs, a case in `Merge()`, a length in
`AttributeKeysCount()`, a bit in `coverage_bits`, and a copy of the
resolver body under a new K generic. `resolveAttributePreload[K, T]`
was instantiated once per attribute type.

After: `Needs.Attributes map[byte]*AttributeSet` keyed by
`dal.SubAttrX` bytes with a `Keys map[string]struct{}` set of
canonical bytes. Admission and the mirror worker use
`p.Add(dal.SubAttrX, key.Bytes())` instead of per-type map
assignments. Resolver dispatch lives in a new `attribute_resolvers.go`
file — one `protoAttrResolver[T]` entry per attribute cache holds the
cache/loader/store triple; the builder iterates over the registration
table.

Two side effects that fall out for free:
- Adding a new attribute cache is a one-line entry in
  `buildAttrResolvers` instead of edits across five files.
- Preload declarations no longer scale with the number of typed maps;
  the resolver is one code path for all attribute caches.

Wire compatibility with pre-refactor plans is preserved:
`CacheGuaranteed` → `Declare`, `CacheNeedsTouch` → `Touch`, `CacheMiss +
bloom/Pebble-absent` → `Declare`, `CacheMiss + Pebble-load-hit` →
`Value(v)`. Cache primitives (`AttributeCache.Get` gen0→gen1 fallback,
`AttributeCache.Del` strict Gen0 contract) and the FSM Preload switch
(`Declare` / `Touch` / `Value` dispatch) stay at `release/v3.0`
semantics.

Also closes a few pre-existing invariant #6 gaps surfaced by the audit:
`DeleteLedger` now preloads the boundary attribute, `DeleteMetadata` /
`DeleteLedgerMetadata` / `DeletePreparedQuery` / `DropIndex` /
`RemoveMetadataFieldType` / `RemoveEventsSink` / `MirrorIngest.DeletedMetadata`
declare coverage for their strict-Del target. These sites all reach
`AttributeCache.Del` which requires Gen0 to hold the entry; the missing
declarations would have silently no-oped the FSM read and desynced
nodes on any concurrent-rotation path.
@gfyrag gfyrag force-pushed the refactor/EN-1242-generic-needs branch from 7226eb9 to 5bb9f23 Compare July 3, 2026 08:11
@gfyrag gfyrag merged commit 83ca35e into release/v3.0 Jul 3, 2026
14 checks passed
@gfyrag gfyrag deleted the refactor/EN-1242-generic-needs branch July 3, 2026 08:21
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants