Skip to content

[wip] db/state: inv_index merge remove 1 intermediate buffer to reduce peak memory#20597

Open
AskAlexSharov wants to merge 12 commits intomainfrom
alex/oom_merge_35
Open

[wip] db/state: inv_index merge remove 1 intermediate buffer to reduce peak memory#20597
AskAlexSharov wants to merge 12 commits intomainfrom
alex/oom_merge_35

Conversation

@AskAlexSharov
Copy link
Copy Markdown
Collaborator

@AskAlexSharov AskAlexSharov commented Apr 16, 2026

for: #20560

Summary

  • Removes keyBuf/valBuf buffering pattern in InvertedIndexRoTx.mergeFiles.
  • Previously the loop deferred each write by one iteration, meaning two large EF byte slices were live simultaneously: valBuf (previous key, pending write) and lastVal (current key, just merged). For hot keys (e.g. popular ERC-20 Transfer topic, popular contract address) each slice can be 50–200 MB.
  • Now writes lastKey/lastVal directly after merge and resets lastVal = lastVal[:0], halving peak memory for the worst-case key.
  • Adds TestInvIndexMergeFiles_SharedKey: merges 4 .ef files where key-1 appears in every file (module=1, aggStep=32 forces EliasFano path), then reads the merged file directly and verifies the full txNum sequence.

It's not enough to fully eliminate OOM - but it's step forward

Commit 835dd9b in ethpandaops/ethereum-package introduced GpuConfig,
a Starlark built-in that requires Kurtosis CLI >=1.18.1. CI installs
an older version so Starlark eval fails. Pin to e07503d (the commit
just before the regression) until CI is upgraded to Kurtosis 1.18.1.
Commit 835dd9b on main introduced GpuConfig, a Starlark built-in that
requires Kurtosis CLI >=1.18.1. CI installs an older version so Starlark
eval fails at startup. Pin to the 6.1.0 release tag which has GLOAS
support but not the breaking GpuConfig. Unpin once CI is on 1.18.1.
@AskAlexSharov AskAlexSharov requested a review from Copilot April 16, 2026 03:28
@AskAlexSharov AskAlexSharov changed the title [wip] db/state: eliminate keyBuf/valBuf delay in inverted index merge to reduce peak memory [wip] db/state: inv_index merge remove 1 intermediate buffer to reduce peak memory Apr 16, 2026
@AskAlexSharov AskAlexSharov requested a review from wmitsuda April 16, 2026 03:30
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR reduces peak memory usage during inverted index .ef file merges by removing the 1-iteration delayed keyBuf/valBuf buffering in InvertedIndexRoTx.mergeFiles, and adds a regression test covering the “single hot key appears in every input file” scenario.

Changes:

  • Write merged lastKey/lastVal immediately in InvertedIndexRoTx.mergeFiles (removes delayed buffering that kept two large merged value slices live).
  • Adjust HistoryRoTx.mergeFiles to use stack-allocated SequenceReader/SequenceIterator values.
  • Add TestInvIndexMergeFiles_SharedKey to validate correctness when one key spans all merged .ef inputs.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.

File Description
db/state/merge.go Removes 1-step buffering in inverted index merge loop to reduce peak memory; minor iterator reset adjustment in history merge.
db/state/merge_test.go Adds a regression test that merges multiple .ef files sharing a single key and validates the merged txNum sequence.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread db/state/merge_test.go
Comment thread db/state/merge_test.go
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Reduces peak memory during inverted-index .ef merges by removing the “one-iteration-behind” buffering pattern in InvertedIndexRoTx.mergeFiles, and adds a regression test for the worst-case “same hot key across all files” merge scenario. Also pins a Kurtosis CI dependency to avoid Starlark evaluation breakage.

Changes:

  • Write merged lastKey/lastVal immediately in InvertedIndexRoTx.mergeFiles to avoid holding two large merged value buffers at once.
  • Add TestInvIndexMergeFiles_SharedKey to validate correct merge output when one key is present in every input .ef file.
  • Pin ethereum_package_branch for the glamsterdam Kurtosis suite to a known-good version.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 1 comment.

File Description
db/state/merge.go Removes intermediate buffering in inverted-index merge; minor iterator reset change in history merge to use value iterator/reader.
db/state/merge_test.go Adds a focused regression test for shared-key .ef merges via the EliasFano merge path.
.github/workflows/test-kurtosis-assertoor.yml Pins the ethereum-package version for one suite to avoid CI breakage from Kurtosis CLI requirements.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread db/state/merge_test.go
@AskAlexSharov AskAlexSharov added this pull request to the merge queue Apr 18, 2026
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks Apr 18, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants