Skip to content

feat(solana-indexer): PR 4 — Component structs (skeleton declarations)#4514

Open
tilacog wants to merge 8 commits into
solana-indexer/PR3-bootstrapfrom
solana-indexer/PR4-bootstrap
Open

feat(solana-indexer): PR 4 — Component structs (skeleton declarations)#4514
tilacog wants to merge 8 commits into
solana-indexer/PR3-bootstrapfrom
solana-indexer/PR4-bootstrap

Conversation

@tilacog

@tilacog tilacog commented Jun 10, 2026

Copy link
Copy Markdown
Contributor

Description

Adds the indexer/ module with skeleton declarations for the four components that will do the actual work of the indexer. Each struct declares its fields, a constructor, and doc comments describing what it will do; the run bodies are unimplemented! and the behavior lands in later PRs.

The four components and their roles:

  • Ingester: subscribes to the Yellowstone gRPC stream and drains it as fast as updates arrive, forwarding them to the decoder. It does no decoding itself, so the socket never backs up behind slow processing. It is also the single writer of the "latest chain slot" counter that the other components use to know how far the chain has advanced.
  • Decoder: receives the raw stream updates, picks out transactions belonging to the settlement and SolFlow programs, matches each transaction with its corresponding account-update snapshot, and persists the resulting typed events to the store.
  • Partial-event watchdog: some events arrive in two halves (a transaction update and an account update) that don't always land together. The decoder parks the half it has in a map shared with the watchdog; the watchdog periodically scans that map and dead-letters any entry whose other half never showed up within the slot window, recording which half went missing.
  • Finalization worker: rows are first written at the confirmed commitment level. This worker re-checks them against the chain and promotes them to finalized, or marks them rolled back if the transaction disappeared. It uses a cheap batched RPC call for recent rows and falls back to one-call-per-row lookups for rows old enough that the batched method no longer reports them.

The field declarations also pin down how the components talk to each other: the ingester feeds the decoder over a bounded channel, while the decoder and the watchdog share a concurrent map with no message passing between them. That second decision reshaped the partial-event types, so types/channel.rs became types/shared.rs.

Changes

  • Added indexer/ingester.rs: the Ingester struct. Generic over the connection so unit tests can drive it with a mock (the trait-bound approach chosen in PR 3 instead of a third in-crate trait). Also declares the shared latest-chain-slot counter and the constants for reconnect backoff and channel capacity.
  • Added indexer/decoder.rs: the Decoder struct, holding the receiving end of the channel, the shared partial-event map, and the two program ids it filters for.
  • Added indexer/watchdog.rs: the PartialEventWatchdog struct, holding the store and its view of the shared partial-event map.
  • Added indexer/finalization.rs: the FinalizationWorker struct, with module docs explaining the two promotion flows and the constants that bound them (finalization window, batch size, retention horizon).
  • Renamed types/channel.rs to types/shared.rs and reworked the partial-event types: instead of payloads sent over a channel, a partial event is now a key (slot + signature) and a value (whichever half arrived first) in the map shared by the decoder and the watchdog.
  • Dependencies: added yellowstone-grpc-client to the workspace (source of the connection trait the ingester is generic over), plus dashmap and a few tokio features to the crate.

How to test

  1. cargo check -p solana-indexer
  2. cargo clippy --locked -p solana-indexer --all-features --all-targets -- -D warnings

No unit tests are included: the structs declare shape only, and every run body is unimplemented! until the behavior PRs.


This is a follow-up PR to #4508

@tilacog tilacog requested a review from a team as a code owner June 10, 2026 17:33

@gemini-code-assist gemini-code-assist Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces the skeleton structure for the Solana settlement indexer components, including the Ingester, Decoder, PartialEventWatchdog, and FinalizationWorker, along with necessary dependency updates and shared type definitions. Feedback on the changes suggests replacing the global static LATEST_CHAIN_SLOT with a shared Arc passed via constructors to avoid process-wide shared mutable state and potential race conditions during parallel testing.

Important

The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.

/// The sole writer is the ingester, on every slot-filter message. Anchors the
/// partial-event watchdog and the finalization worker. Cold start is zero; the
/// watchdog skips its comparison on the first tick.
pub static LATEST_CHAIN_SLOT: AtomicU64 = AtomicU64::new(0);

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

Using a global static AtomicU64 for LATEST_CHAIN_SLOT introduces shared mutable state across the entire process. This causes race conditions and flakiness when running unit/integration tests in parallel (Cargo's default behavior). It also prevents running multiple indexer instances in the same process.

Actionable Suggestion:
Remove the global static and instead pass an Arc<AtomicU64> (or a shared state struct) to the constructors of Ingester, Decoder, PartialEventWatchdog, and FinalizationWorker.

References
  1. Focus exclusively on identifying missing edge cases, potential race conditions, or logic that deviates from the PR's stated goals. (link)

@github-actions

Copy link
Copy Markdown

This pull request has been marked as stale because it has been inactive a while. Please update this pull request or it will be automatically closed.

@github-actions github-actions Bot added the stale label Jun 18, 2026
Comment on lines +27 to +28
/// Typical number of slots for a transaction to finalize (~12.8 s). The
/// promotion pass skips rows fresher than this.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What does typical mean here? In which circumstances are 32 slots NOT correct?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tried to clarify the doc.


/// Typical number of slots for a transaction to finalize (~12.8 s). The
/// promotion pass skips rows fresher than this.
#[allow(dead_code)]

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of using #[allow(lint)] please use #[expect(lint)]. The reason is that clippy will automatically generate a warning when the lint is no longer violated which forces you to clean up stale lint exceptions.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done


/// Transaction finalization worker. See the module docs for the two flows it
/// runs.
pub struct FinalizationWorker<St: Store, R: SolanaClient> {

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMO going down the route of generics instead of Arc<dyn Trait> is a slippery slope. Type parameters need to be propagated throughout the code base and the tiny overhead of chasing a pointer when using dyn trait hardly matters if the component issues network requests.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. Switched Store and SolanaClient to #[async_trait] (in #4508), and all the components now hold Arc<dyn Store> / Arc<dyn SolanaClient> instead of generic params. Also dropped the #![allow(async_fn_in_trait)] workaround. Lines up with the rest of the repo.

Comment on lines +16 to +19
/// The sole writer is the ingester, on every slot-filter message. Anchors the
/// partial-event watchdog and the finalization worker. Cold start is zero; the
/// watchdog skips its comparison on the first tick.
pub static LATEST_CHAIN_SLOT: AtomicU64 = AtomicU64::new(0);

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the reason that this value does not live inside a specific component that gets accessed via a getter?
Also the API is currently fragile. The type is pub and allows anyone to access it. A safer API would be a new type wrapping the counter with an update function that is only accessible in this module for example.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Half of this lands in #4549, where the global static becomes an Arc<AtomicU64> owned by the Ingester, so it's no longer a free-floating global. The newtype-with-restricted-write part isn't done yet though, it's still a raw Arc<AtomicU64> any holder can write to. I'll wrap it in a small ChainTip type in #4549 then.

Comment on lines +25 to +26
/// Capacity of the channel from the ingester to the decoder.
pub const INGEST_TO_DECODER_CAPACITY: usize = 1024;

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This and a few other consts seems like it should be configurable by a config file.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These move to the config module in #4549.

Comment on lines +25 to +26
/// Capacity of the channel from the ingester to the decoder.
pub const INGEST_TO_DECODER_CAPACITY: usize = 1024;

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you explain why the ingester is necessary? Why is it better to have it forward the events to a different channel? If the other channel it pushed into does not get cleared fast enough the backpressure will end up here anyway.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

UUIC, the ingester exists to keep reading from the stream without pausing since yellowstone disconnects slow readers, and the slowest part of our work is writing to the DB. If one loop reads a message and then writes it to the DB, it stops reading while it writes, messages pile up, and the server drops us. A bigger buffer in the client shouldn't help, because a buffer still needs something to keep emptying it, and that same loop is busy writing.

You're right that a long overload still backs up to the ingester and drops us anyway. The buffer only smooths over short slow spikes, like a burst in one slot or one slow write. So the split isn't about speed or handling more load. It only keeps us from getting disconnected during short slow moments.

@tilacog please correct me if I am wrong.

Comment on lines +36 to +38
/// Sends `StreamUpdate` to the decoder. Should be bounded to
/// `RECONNECT_BACKOFF_CAP` entries.
pub tx: Sender<StreamUpdate>,

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comment says the size of the channel should be bounded to a unit of time which does not seem to make sense.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed, the doc now references INGEST_TO_DECODER_CAPACITY (the channel capacity), not the backoff duration.

Comment on lines +36 to +40
/// Shared in-memory map of partial events keyed by `PartialEventKey`.
///
/// The decoder holds a clone of this `Arc` and both inserts and removes
/// halves as it processes them.
pub partials: Arc<DashMap<PartialEventKey, PartialEvent>>,

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What was the reasoning for using channels for some of the communication but not for this partials map?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Both the decoder and the watchdog need to look entries up and remove them by key, which a stream can't do. The decoder removes the matching half by (slot, signature) once the second one arrives, and the watchdog scans for the ones that waited too long and drops those. So I guess this map is reasonable here.

cc @tilacog

Comment on lines +16 to +19
/// The sole writer is the ingester, on every slot-filter message. Anchors the
/// partial-event watchdog and the finalization worker. Cold start is zero; the
/// watchdog skips its comparison on the first tick.
pub static LATEST_CHAIN_SLOT: AtomicU64 = AtomicU64::new(0);

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It should be documented that this slot number is completely unrelated to the indexing progress of the system. Obviously depends on the rest of the indexer logic but it sounds like a counter that tracks fully processed slots might be more useful.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Documented it. The doc now says LATEST_CHAIN_SLOT is the chain tip, not indexing progress. The processed-slots counter you're describing already exists separately, it's the watermark in solana.indexer_state written by the decoder.

Comment thread crates/solana-indexer/src/indexer/decoder.rs Outdated
Comment thread crates/solana-indexer/src/types/shared.rs Outdated
Comment thread crates/solana-indexer/src/indexer/ingester.rs
@squadgazzz squadgazzz removed the stale label Jun 26, 2026
…tstrap

Resolve conflicts from the updated PR3 base:
- keep dashmap (PR4) plus bytes/derive_more (PR3), drop observe/prometheus (metrics.rs gone)
- shared.rs: keep PR4's PartialEventKey/PartialEvent rework, adopt PR3's Slot and pub(crate)
- types/mod.rs: keep both the shared (PR4) and slot (PR3) modules
- lower the four indexer structs to pub(crate) to match PR3's pub(crate) types, and add expect(dead_code)/expect(unused_imports) for the not-yet-wired scaffolding
@socket-security

socket-security Bot commented Jun 26, 2026

Copy link
Copy Markdown

Review the following changes in direct dependencies. Learn more about Socket for GitHub.

Diff Package Supply Chain
Security
Vulnerability Quality Maintenance License
Addedyellowstone-grpc-client@​13.1.010010093100100

View full report

@socket-security

socket-security Bot commented Jun 26, 2026

Copy link
Copy Markdown

Warning

Review the following alerts detected in dependencies.

According to your organization's Security Policy, it is recommended to resolve "Warn" alerts. Learn more about Socket for GitHub.

Action Severity Alert  (click "▶" to expand/collapse)
Warn High
Obfuscated code: cargo openssl is 90.0% likely obfuscated

Confidence: 0.90

Location: Package overview

From: ?cargo/sqlx@0.8.6cargo/openssl@0.10.75

ℹ Read more on: This package | This alert | What is obfuscated code?

Next steps: Take a moment to review the security alert above. Review the linked package source code to understand the potential risk. Ensure the package is not malicious before proceeding. If you're unsure how to proceed, reach out to your security team or ask the Socket team for help at support@socket.dev.

Suggestion: Packages should not obfuscate their code. Consider not using packages with obfuscated code.

Mark the package as acceptable risk. To ignore this alert only in this pull request, reply with the comment @SocketSecurity ignore cargo/openssl@0.10.75. You can also ignore all packages with @SocketSecurity ignore-all. To ignore an alert for all future pull requests, use Socket's Dashboard to change the triage state of this alert.

View full report

- PartialEventKey uses Slot instead of u64, matching StreamUpdate
- finalization: clarify FINALIZATION_WINDOW_SLOTS doc, drop the per-const
  #[allow(dead_code)] in favor of the module-level #![expect(dead_code)]
Replace the generic type params on the four components with Arc<dyn Store> /
Arc<dyn SolanaClient> fields, per review feedback. Avoids propagating type
params through the codebase, and dyn dispatch is negligible for these
I/O-bound components. Ingester keeps its GrpcConnector generic for stream
mocking.
- note LATEST_CHAIN_SLOT is the chain tip, not indexing progress (the watermark
  in solana.indexer_state is the separate processed-slot counter)
- fix the decoder channel doc to reference INGEST_TO_DECODER_CAPACITY, not the
  unrelated RECONNECT_BACKOFF_CAP duration
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants