Skip to content

fix(ctrl): read ledger-stitched queries from a single snapshot#1445

Merged
Azorlogh merged 8 commits into
release/v3.0from
fix/EN-1408-single-snapshot-reads
Jul 2, 2026
Merged

fix(ctrl): read ledger-stitched queries from a single snapshot#1445
Azorlogh merged 8 commits into
release/v3.0from
fix/EN-1408-single-snapshot-reads

Conversation

@Azorlogh

Copy link
Copy Markdown
Contributor

Migrated from formancehq/ledger-v3-poc#589 (rebased onto release/v3.0).

Single-snapshot reads (EN-1408 follow-up)

Sweeps the remaining read paths that stitch a LedgerInfo read together with per-entity attribute reads so they all read from one Pebble snapshot — the canonical pattern from the GetLedger/ListLedgers fix.

Converted (one NewReadHandle() up front, all reads routed through it): GetTransactionFrom, GetAccount, GetLedgerStats, AnalyzeAccounts, AggregateVolumes, InspectIndex, ListLogs (also fixes an inverted handle), ListPreparedQueries (also NewDirectReadHandleNewReadHandle), GetNumscript, ListNumscripts.

This is defense-in-depth: none of these combined two snapshots into an observable inconsistency today (the second read only supplied the immutable ledger name), so there's no new red e2e test — the existing get_ledger_snapshot_test.go remains the coverage. Pinning makes a future change that embeds LedgerInfo content safe by construction. Left intentionally unchanged: GetMetadataSchemaStatus (single read), GetIndexStatus / ListIndexes (intentional cross-store / streaming reads).

Transaction receipt correctness on routed reads

The receipt for GetTransaction is computed from the transaction's creation log. Three fixes (driven by review):

  1. Open the receipt snapshot after the transaction read — opening it before ctrl.GetTransaction's routed/linearizable read could pin a state predating the transaction's creation log → empty receipt for an existing tx.
  2. Reuse the serving node's receipt for forwarded reads — when the read is routed to the leader, re-deriving the receipt from a (possibly-lagging) local snapshot is wrong. BucketGrpcClient was discarding resp.Receipt; now it's surfaced.
  3. Authoritative-or-absent receiptController.GetTransaction returns *string: nil for a locally-served read (the gRPC layer signs from the post-read snapshot), non-nil and authoritative (possibly empty, e.g. a reversal) when forwarded — used as-is, never re-derived. Internal plumbing only; the wire receipt field stays a string.

Mocks regenerated for the interface change.

Azorlogh added 4 commits June 29, 2026 15:08
Every query that returns data stitched from a LedgerInfo read plus per-entity attribute reads now opens one Pebble snapshot (NewReadHandle) up front and routes the existence/name read and the payload reads through it — the same pattern as the GetLedger/ListLedgers fix (#583). Converted: GetTransactionFrom, GetAccount, GetLedgerStats, AnalyzeAccounts, AggregateVolumes, InspectIndex, ListLogs (fixes an inverted handle), ListPreparedQueries (also NewDirectReadHandle->NewReadHandle), GetNumscript, ListNumscripts, and computeTransactionReceipt's live path.

Defense-in-depth: unlike GetLedger (whose response packs two coupled mutable values) none of these combine two snapshots into an observable inconsistency today — the second read only supplied the immutable ledger name — so no new e2e test is added (none could be made red); the existing get_ledger_snapshot_test.go remains the coverage. GetMetadataSchemaStatus (single read), GetIndexStatus and ListIndexes (intentional cross-store / streaming reads) are left as-is.

Refs EN-1408
The previous commit opened the receipt's snapshot before impl.ctrl.GetTransaction, whose routed/linearizable read (ReadIndex barrier, possible leader forwarding) can land at a later committed state. A transaction whose creation log commits in that gap would be returned by GetTransaction but missed by the earlier receipt snapshot, yielding an empty receipt for an existing transaction. Open the snapshot after the read instead (and only when receipt signing is enabled). Caught by NumaryBot review on #589.
The previous fix opened the receipt snapshot after the transaction read, but that only helps locally-served reads. When GetTransaction is routed to the leader (Consistency: leader, or a linearizable read falling back to the leader), the transaction is read from the leader's fresh state while the receipt is still re-derived from a local snapshot that may lag — yielding an empty receipt for an existing transaction.

The leader already signs the receipt and returns it; we were discarding it. Controller.GetTransaction now also returns that upstream receipt (empty for a locally-served read, where the gRPC layer signs from the post-read snapshot). The adapter reuses a non-empty forwarded receipt and only re-derives locally otherwise. Regenerated the Controller/Backend mocks for the new signature.

Caught by NumaryBot/codex review on #589 (finding fa4181052084ad35).
fwdReceipt != "" conflated two cases: a forwarded read of a receipt-less transaction (notably a reversal, whose creation log is a RevertedTransaction so the leader returns an empty receipt) fell into local re-derivation on a possibly-stale follower — wasteful in the common case, and an error if that follower's snapshot lacks the ledger.

Controller.GetTransaction now returns *string: nil for a locally-served read (the gRPC layer signs from the post-read snapshot), non-nil and authoritative (possibly empty, e.g. a reversal) when forwarded to a signing node — used as-is, never re-derived. Internal plumbing only; the wire receipt field stays a string. Regenerated the Controller/Backend mocks.

Addresses NumaryBot finding 865ba8aa on #589.
@Azorlogh Azorlogh requested a review from gfyrag as a code owner June 29, 2026 15:09
@coderabbitai

coderabbitai Bot commented Jun 29, 2026

Copy link
Copy Markdown
Contributor

Important

Review skipped

Auto reviews are disabled on base/target branches other than the default branch.

🗂️ Base branches to auto review (1)
  • main

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: ed15a819-5de8-4e68-ab87-ea22c6389d7c

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Use the checkbox below for a quick retry:

  • 🔍 Trigger review
✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch fix/EN-1408-single-snapshot-reads

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands.

@NumaryBot

NumaryBot commented Jun 29, 2026

Copy link
Copy Markdown
Contributor

✅ Approve — automated review

The PR consistently routes ledger-stitched queries through a single snapshot/read handle, eliminating snapshot inconsistency across the affected read paths. The one previously confirmed major issue — forwarded receipts being silently dropped when the local receipt signer is disabled — was identified in earlier automated review, resolved in commit 5c9f5c5, and verified with an end-to-end test before the current review cycle. All independent reviewers found no regressions or new issues in the updated code. There are no remaining blocker or major findings; the PR is ready to merge.

No findings.

@NumaryBot NumaryBot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

NumaryBot posted 1 new inline finding.

Summary: #1445 (comment)

Comment thread internal/adapter/grpc/server_bucket.go Outdated
The receiptSigner != nil guard wrapped the forwarded-receipt reuse, so a node with signing disabled that routed GetTransaction to a signing leader dropped the leader's authoritative receipt — making receipt presence depend on which node the client contacted under heterogeneous config. Relaying an already-signed token needs no local signer; gate only local derivation on it.

Addresses NumaryBot finding 59afc7b354ba2988 on #1445.

@NumaryBot NumaryBot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

NumaryBot posted 1 new inline finding.

Summary: #1445 (comment)

Comment thread internal/adapter/grpc/server_bucket.go
@codecov

codecov Bot commented Jun 29, 2026

Copy link
Copy Markdown

Codecov Report

❌ Patch coverage is 44.28571% with 39 lines in your changes missing coverage. Please review.
✅ Project coverage is 71.82%. Comparing base (aa373d5) to head (d3a8450).
⚠️ Report is 2 commits behind head on release/v3.0.

Files with missing lines Patch % Lines
internal/application/ctrl/controller_default.go 36.36% 9 Missing and 19 partials ⚠️
internal/adapter/grpc/server_bucket.go 64.70% 1 Missing and 5 partials ⚠️
internal/adapter/grpc/client_bucket.go 50.00% 0 Missing and 2 partials ⚠️
internal/bootstrap/controller_routed.go 50.00% 1 Missing and 1 partial ⚠️
internal/adapter/http/handlers_get_transaction.go 0.00% 0 Missing and 1 partial ⚠️

❌ Your patch status has failed because the patch coverage (44.28%) is below the target coverage (80.00%). You can increase the patch coverage or adjust the target coverage.

Additional details and impacted files
@@               Coverage Diff                @@
##           release/v3.0    #1445      +/-   ##
================================================
- Coverage         73.66%   71.82%   -1.84%     
================================================
  Files               388      388              
  Lines             39747    39917     +170     
================================================
- Hits              29278    28671     -607     
- Misses             7951     8000      +49     
- Partials           2518     3246     +728     
Flag Coverage Δ
e2e 71.82% <44.28%> (-1.84%) ⬇️
scenario 71.82% <44.28%> (-1.84%) ⬇️
unit 71.82% <44.28%> (-1.84%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Harness.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@gfyrag gfyrag left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Les fixes de receipt sont réels (Bug A: receipt forwardé jeté; Bug D: gate signer trop englobante) et le tri-state *string encode proprement l'autorité vs l'absence. La sweep single-snapshot est du hardening défendable — pas de bug observable aujourd'hui mais by-construction safe pour toute future lecture de champ mutable de LedgerInfo.

Deux remarques non bloquantes:

  1. Coverage gap sur les receipts. Aucun test rouge ne couvre les bugs corrigés côté serveur:

    • Follower en retard forward au leader et surface le receipt (Bug A).
    • Node sans signer relaye un receipt de leader signeur (Bug D).
    • Reversal forwardé garde receipt="" non-nil (Bug C, tri-state).

    Seul client_bucket_test.go valide le round-trip côté client. codecov/patch a échoué. Ces bugs ont été trouvés par NumaryBot, pas par test — un futur rework du routing pourrait les réintroduire silencieusement. Un test d'intégration deux-nodes (follower lag + Consistency: leader) rendrait la zone stable.

  2. Sweep sans garde-fou. 10 handlers modifiés pour un pattern qu'il faudrait maintenant tenir. Serait utile de documenter dans docs/technical/architecture/ la règle « toute lecture qui stitch LedgerInfo avec un attribut ouvre UN handle », voire un lint qui interdit GetLedgerByName(ctx, ctrl.store, …) en dehors d'exceptions listées (GetMetadataSchemaStatus, GetIndexStatus, ListIndexes).

Azorlogh added 2 commits July 2, 2026 13:44
Configures only the leader with a receipt-signing key; a Consistency: leader GetTransaction issued against a follower with no signer must relay the leader-signed receipt rather than drop it. Red against the pre-fix signer-gated relay, green after. Adds a WithNodeInstruments helper for per-node cluster configuration.
Records the guardrail behind the read-path sweep: any controller read that stitches a LedgerInfo lookup with attribute reads opens ONE ReadHandle and routes everything through it, including GetLedgerByName. Covers the canonical pattern, the torn-read rationale, the transaction-receipt rules, and the deliberate exceptions (GetMetadataSchemaStatus, GetIndexStatus, ListIndexes).
@Azorlogh Azorlogh requested a review from paul-nicolas as a code owner July 2, 2026 14:02
@Azorlogh

Azorlogh commented Jul 2, 2026

Copy link
Copy Markdown
Contributor Author

Bug D now has a red-verified e2e test (get_transaction_receipt_forwarded_test.go), and the single-snapshot rule is documented in core/read-snapshot-consistency.md.

The other two can't get a meaningful red test: Bug A needs deterministic follower lag, but the test gateway isn't in the Raft path (nodes talk on real ports), and Bug C isn't observable — a forwarded reversal returns receipt="" whether it's relayed or re-derived locally.

@NumaryBot NumaryBot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

NumaryBot review complete: no remaining inline findings.

Resolved 1 stale NumaryBot review thread (1 fixed, 0 outdated).

Summary: #1445 (comment)

…ingle-snapshot-reads

# Conflicts:
#	docs/technical/architecture/README.md
#	docs/technical/architecture/subsystems/consensus/read-snapshot-consistency.md
@Azorlogh Azorlogh merged commit 5a10939 into release/v3.0 Jul 2, 2026
10 checks passed
@Azorlogh Azorlogh deleted the fix/EN-1408-single-snapshot-reads branch July 2, 2026 14:30
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants