feat(retain): configurable chunk overflow factor (#2136) #2141

Closed
nicoloboschi wants to merge 1 commit into main from feat/configurable-chunk-overflow-factor-2136

@nicoloboschi

Collaborator

Closes #2136.

What

Makes the chunk overflow factor configurable. Previously hardcoded at 1.5, this factor controls how far a single JSONL line or conversation turn may exceed the chunk-size budget while still being kept whole (before falling back to splitting it as text).
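As a rough illustration of that rule, here is a minimal sketch of line-oriented chunking with an overflow allowance. The helper name `chunk_lines`, the character-based length check, and the fixed-width fallback split are assumptions made for this example; they are not taken from the PR's actual `_chunk_jsonl` implementation.

```python
def chunk_lines(lines: list[str], chunk_size: int, overflow_factor: float = 1.5) -> list[str]:
    """Pack lines into chunks of roughly chunk_size characters (hypothetical helper)."""
    # Largest single unit that is still kept intact (assumed to be measured in characters).
    max_whole = chunk_size * overflow_factor
    chunks: list[str] = []
    current = ""
    for line in lines:
        if len(line) > max_whole:
            # Too large even with the overflow allowance: flush the buffer,
            # then fall back to splitting the line as plain text.
            if current:
                chunks.append(current)
                current = ""
            chunks.extend(line[i:i + chunk_size] for i in range(0, len(line), chunk_size))
        elif current and len(current) + 1 + len(line) > chunk_size:
            # The line does not fit next to the buffered text: start a new chunk
            # and keep the line whole, even if it exceeds chunk_size on its own.
            chunks.append(current)
            current = line
        else:
            current = f"{current}\n{line}" if current else line
    if current:
        chunks.append(current)
    return chunks
```

With a 100-character budget and lines of 100, 140, and 50 characters, a factor of 1.5 yields three chunks with the 140-character line intact, while a factor of 1.0 yields four because that line is split.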

New flag: HINDSIGHT_API_RETAIN_CHUNK_OVERFLOW_FACTOR (default 1.5, hierarchical / per-bank overridable).

Why

Per the issue: when ingesting JSONL where some messages are larger but still comfortably within the extraction model's context, the only way to avoid splitting them today is to raise the global chunk size. Raising the overflow factor instead (e.g. to 3.0) keeps longer structured messages intact in a single chunk — preserving role/time/filename context — while keeping a smaller default chunk size everywhere else.

How

  • config.py: env constant, default, dataclass field, _CONFIGURABLE_FIELDS entry, from_env() wiring, and _parse_chunk_overflow_factor() (validates >= 1.0, fails fast otherwise).
  • fact_extraction.py: chunk_text / _chunk_conversation / _chunk_jsonl take an overflow_factor param (default = the existing module constant). The three bank-config-driven call sites pass config.retain_chunk_overflow_factor.
  • orchestrator.py: both chunk_text call sites thread the resolved factor through.
  • memory_engine.py: _resolve_retain_chunk_size → _resolve_retain_chunk_params, returning a RetainChunkParams dataclass (chunk size + factor) so the chunk-offset counting matches the orchestrator's actual chunking when an oversized unit is present.
  • Docs row added to configuration.md (+ regenerated docs skill).

Tests

  • test_chunking.py — a raised overflow_factor keeps a JSONL line / conversation turn whole that the default would split.
  • test_config_validation.py — default, env override, and < 1.0 rejection.
  • test_hierarchical_config.py — new field is configurable; count assertion 38 → 39.

All affected non-DB tests pass; lint + ty clean.

The factor that keeps a single JSONL line / conversation turn whole when
it overflows the chunk-size budget was hardcoded at 1.5x. Expose it as
HINDSIGHT_API_RETAIN_CHUNK_OVERFLOW_FACTOR (default 1.5, hierarchical /
per-bank) so longer structured messages can stay within a single chunk
without globally inflating RETAIN_CHUNK_SIZE.

chunk_text / _chunk_conversation / _chunk_jsonl now take an overflow_factor
param (default = the module constant); the bank-config-driven call sites in
the orchestrator and fact extractor pass config.retain_chunk_overflow_factor.
_resolve_retain_chunk_size becomes _resolve_retain_chunk_params, returning a
RetainChunkParams dataclass (size + factor) so the chunk-offset counting in
memory_engine matches the orchestrator's actual chunking when an oversized
unit is present.

Closes #2136
@nicoloboschi force-pushed the feat/configurable-chunk-overflow-factor-2136 branch from 99bf3fd to 1896f2f on June 11, 2026 14:54
@nicoloboschi

Collaborator Author

Closing in favor of #2139, which already closes #2136 with a more complete implementation (typed bank-template API, OpenAPI, generated clients, control-plane UI, CLI). #2139 uses an absolute char cap (retain_structured_unit_max_chars) rather than this PR's multiplier — the issue offered both shapes and the absolute cap is easier to reason about.
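To make the difference between the two shapes concrete, here is a small sketch. It assumes the absolute cap is compared directly against a unit's character length; the helper names and the sample chunk sizes (3000 and 6000) are hypothetical and not taken from either PR.

```python
def multiplier_limit(chunk_size: int, overflow_factor: float) -> int:
    """Largest unit kept whole under this PR's multiplier shape."""
    return int(chunk_size * overflow_factor)


def absolute_limit(retain_structured_unit_max_chars: int) -> int:
    """Largest unit kept whole under an absolute character cap (assumed semantics)."""
    return retain_structured_unit_max_chars


for chunk_size in (3000, 6000):
    # The multiplier limit moves with the chunk size; the absolute cap does not.
    print(chunk_size, multiplier_limit(chunk_size, 1.5), absolute_limit(4500))
```

With a multiplier, changing the chunk size also changes the whole-unit limit (4500 versus 9000 here), whereas an absolute cap stays at the configured value regardless of chunk size.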

@nicoloboschi deleted the feat/configurable-chunk-overflow-factor-2136 branch on June 11, 2026 15:09
Development

Successfully merging this pull request may close these issues.

Allow customizable chunk overflow factor or max chunk size