feat(retain): make structured-unit chunk limit configurable #2139

Open
Sanderhoff-alt wants to merge 1 commit into vectorize-io:main from Sanderhoff-alt:feat/retain-structured-unit-max-chars

Sanderhoff-alt (Contributor) commented Jun 11, 2026

Summary

Adds retain_structured_chunk_size, an explicit chunk-size setting for structured retain inputs such as JSONL and conversation arrays.

When unset, structured chunks now follow the effective retain_chunk_size instead of using the hidden retain_chunk_size * 1.5 overflow allowance. Operators can tune structured inputs independently by setting retain_structured_chunk_size explicitly.
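The fallback described above can be sketched in a few lines. This is an illustration only, not code from the PR: the function names and the rounding of the old 1.5x allowance are assumptions.

```python
from typing import Optional

# The implicit allowance the PR description attributes to the old behavior.
LEGACY_OVERFLOW_FACTOR = 1.5


def legacy_structured_limit(retain_chunk_size: int) -> int:
    """Old behavior (hypothetical helper): a structured unit could exceed
    retain_chunk_size by the hidden 1.5x factor."""
    return int(retain_chunk_size * LEGACY_OVERFLOW_FACTOR)


def effective_structured_limit(
    retain_chunk_size: int,
    retain_structured_chunk_size: Optional[int] = None,
) -> int:
    """New behavior (hypothetical helper): an explicit value wins,
    otherwise the limit equals the effective retain_chunk_size."""
    if retain_structured_chunk_size is None:
        return retain_chunk_size
    return retain_structured_chunk_size
```

With retain_chunk_size set to 3000, the old limit for a single structured unit would have been 4500, while the new default is 3000 unless retain_structured_chunk_size says otherwise.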

Closes #2136.

Motivation

The previous structured chunking behavior allowed a single JSONL line or conversation turn to exceed retain_chunk_size by an implicit 1.5x factor. That made the effective size limit difficult to reason about and impossible to tune directly.

This PR replaces that hidden heuristic with a named configuration field:

  • Default behavior stays simple: if retain_structured_chunk_size is unset, the structured chunk size equals the effective retain_chunk_size.
  • Smaller structured caps are supported: set retain_structured_chunk_size below retain_chunk_size to split very large JSONL lines or conversation turns sooner.
  • Larger structured caps are also supported: set retain_structured_chunk_size above retain_chunk_size to preserve larger structured records whole.
  • Validation rejects non-positive or non-integer chunk sizes while keeping the existing retain_max_completion_tokens > retain_chunk_size check unchanged.
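A minimal sketch of the validation rule in the last bullet, assuming a standalone helper (validate_chunk_sizes is a hypothetical name, not the PR's function). It deliberately imposes no ordering between the two sizes and does not model the existing retain_max_completion_tokens check.

```python
def validate_chunk_sizes(
    retain_chunk_size: object,
    retain_structured_chunk_size: object = None,
) -> None:
    """Reject non-integer or non-positive chunk sizes.

    Hypothetical helper. None is accepted only for the structured size,
    where it means "unset, follow retain_chunk_size".
    """
    candidates = (
        ("retain_chunk_size", retain_chunk_size, False),
        ("retain_structured_chunk_size", retain_structured_chunk_size, True),
    )
    for name, value, optional in candidates:
        if value is None and optional:
            continue
        # bool is a subclass of int in Python, so it is rejected explicitly.
        if isinstance(value, bool) or not isinstance(value, int):
            raise ValueError(f"{name} must be an integer, got {value!r}")
        if value <= 0:
            raise ValueError(f"{name} must be positive, got {value!r}")
```

Both validate_chunk_sizes(3000, 1000) and validate_chunk_sizes(1000, 3000) pass, reflecting that smaller and larger structured caps are both supported.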

Changes

  • Added HINDSIGHT_API_RETAIN_STRUCTURED_CHUNK_SIZE / retain_structured_chunk_size.
  • Threaded the setting through retain extraction, streaming retain, append/prepend chunking, and retain strategy application.
  • Validated resolved global, tenant, bank, and retain strategy configs so chunk size values are checked after hierarchy and strategy overrides are applied.
  • Updated bank config API models, bank templates, MCP tool docs, generated OpenAPI artifacts, and generated clients.
  • Updated the control-plane retain configuration UI to expose the new field for default retain settings and named strategies.
  • Updated the Rust CLI bank set-config command to expose --retain-chunk-size and --retain-structured-chunk-size.
  • Updated .env.example files and generated docs/skill references.
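For operators, the new environment variable might be consumed roughly as follows. This is a sketch under assumptions: the helper name is hypothetical, and treating an empty value as "unset" is a guess about the loader, not something the PR states.

```python
import os
from typing import Mapping, Optional

ENV_VAR = "HINDSIGHT_API_RETAIN_STRUCTURED_CHUNK_SIZE"


def read_structured_chunk_size(
    env: Mapping[str, str] = os.environ,
) -> Optional[int]:
    """Return the configured structured chunk size, or None when unset.

    Hypothetical helper; an empty value is assumed to mean "unset".
    """
    raw = env.get(ENV_VAR)
    if raw is None or raw.strip() == "":
        return None
    try:
        value = int(raw)
    except ValueError as exc:
        raise ValueError(f"{ENV_VAR} must be an integer, got {raw!r}") from exc
    if value <= 0:
        raise ValueError(f"{ENV_VAR} must be positive, got {value}")
    return value
```

A None result means the structured limit follows the effective retain_chunk_size.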

Testing

  • git diff --check
  • ./scripts/hooks/lint.sh
  • cd hindsight-dev && uv run cli-coverage-check
  • Commit hook: generate-docs-skill.sh
  • Commit hook: lint.sh

Additional targeted coverage was added for:

  • JSONL and conversation chunking with explicit and default structured chunk sizes.
  • Config validation for positive chunk sizes and smaller structured caps.
  • Hierarchical config resolution, including bank overrides, null tombstones, and strategy overrides.
  • Bank template configurable fields.
  • MCP tool configuration documentation.
  • Python client payload construction for clearing the structured chunk override.
  • Control-plane retain strategy serialization for inherited, explicit-null, and numeric structured chunk sizes.
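The hierarchical resolution cases in the list above (bank overrides, null tombstones, strategy overrides) can be pictured with the sketch below. The layer order (global, tenant, bank, strategy) and the reading of an explicit null as "clear the inherited override" are assumptions made for illustration; the function name is hypothetical.

```python
from typing import Any, Mapping, Optional, Sequence

KEY = "retain_structured_chunk_size"


def resolve_structured_chunk_size(
    layers: Sequence[Mapping[str, Any]],
) -> Optional[int]:
    """Resolve the setting across config layers, lowest precedence first.

    Hypothetical helper. A missing key inherits from the previous layer;
    a key present with value None acts as a tombstone that clears it.
    """
    resolved: Optional[int] = None
    for layer in layers:
        if KEY in layer:
            resolved = layer[KEY]
    return resolved


global_cfg = {KEY: 2000}
tenant_cfg = {}            # inherits the global value
bank_cfg = {KEY: None}     # tombstone: clears the inherited override
strategy_cfg = {KEY: 500}  # a named strategy sets its own cap
```

Resolving only the first three layers yields None (so the limit follows retain_chunk_size), while adding the strategy layer yields 500. Validation would then run on the resolved value, matching the PR's note that checks happen after overrides are applied.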

nicoloboschi (Collaborator) commented

One naming suggestion: retain_structured_unit_max_chars / HINDSIGHT_API_RETAIN_STRUCTURED_UNIT_MAX_CHARS is a bit verbose and breaks from the existing retain_chunk_size convention.

Consider retain_structured_chunk_size (HINDSIGHT_API_RETAIN_STRUCTURED_CHUNK_SIZE):

  • Keeps the important "structured-only" signal (applies to JSONL lines / conversation turns, not plain text).
  • Mirrors the sibling retain_chunk_size by using ...chunk_size rather than ..._unit_max_chars, so the pair reads naturally (retain_chunk_size + retain_structured_chunk_size).

Purely a naming nit — the approach and the absolute-cap semantics look good.

Sanderhoff-alt force-pushed the feat/retain-structured-unit-max-chars branch 3 times, most recently from d025e9d to 9eac18e (June 11, 2026 16:44)

Add retain_structured_chunk_size as an explicit retain chunking knob
for structured inputs. When unset, structured inputs follow the
effective retain_chunk_size instead of the hidden 1.5x overflow factor.

Thread the setting through retain extraction, append/prepend chunking,
bank config resolution, templates, MCP docs, maintained clients,
generated OpenAPI artifacts, the control-plane retain strategy UI, and
the Rust CLI set-config command.

Validate retain_chunk_size and retain_structured_chunk_size as positive
integers, with no ordering constraint between them (either value may be
the smaller one). Keep the existing retain_max_completion_tokens check
scoped to retain_chunk_size.

Preserve upstream validation details for client errors through the
control-plane proxy so UI alerts and toasts can show concrete
configuration errors without exposing server-side failure details.

Update chunking, config, hierarchical config, template, MCP, client
payload, control-plane serialization, SDK-response, API-client, and
retain UI validation tests for the new behavior.

Sanderhoff-alt force-pushed the feat/retain-structured-unit-max-chars branch from 9eac18e to 23fcdd2 (June 11, 2026 16:51)

Sanderhoff-alt (Contributor, Author) commented

@nicoloboschi
Thanks for the suggestion. I agree the retain_structured_chunk_size naming fits better with the existing retain_chunk_size convention while still making the structured-only behavior clear.

I’ve updated the PR to use retain_structured_chunk_size / HINDSIGHT_API_RETAIN_STRUCTURED_CHUNK_SIZE and --retain-structured-chunk-size throughout the API, config, CLI, clients, docs, and control-plane UI. I kept the existing description semantics focused on the actual behavior: the value controls the maximum size for a single JSONL line or conversation turn to stay whole, and defaults to retain_chunk_size when unset.
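To make the "stay whole" semantics concrete, here is a rough sketch of JSONL chunking under the described rule. It is not the PR's implementation: chunk_jsonl is a hypothetical name, lengths are counted in characters, and oversized lines are split naively at fixed offsets.

```python
from typing import List, Optional


def chunk_jsonl(
    text: str,
    chunk_size: int,
    structured_chunk_size: Optional[int] = None,
) -> List[str]:
    """Pack JSONL lines into chunks (hypothetical sketch).

    A line no longer than the structured limit stays whole, even if it
    exceeds chunk_size. A longer line is split into fixed-size pieces.
    """
    limit = chunk_size if structured_chunk_size is None else structured_chunk_size
    chunks: List[str] = []
    current: List[str] = []
    current_len = 0

    def flush() -> None:
        nonlocal current, current_len
        if current:
            chunks.append("\n".join(current))
            current, current_len = [], 0

    for line in text.splitlines():
        if not line:
            continue
        if len(line) > limit:
            # Oversized unit: emit what we have, then split the line itself.
            flush()
            chunks.extend(line[i:i + limit] for i in range(0, len(line), limit))
            continue
        added = len(line) + (1 if current else 0)  # +1 for the joining newline
        if current and current_len + added > chunk_size:
            flush()
            added = len(line)
        current.append(line)
        current_len += added
    flush()
    return chunks
```

With chunk_size=25, a 30-character line is split when the structured limit is unset, but stays whole as its own chunk when structured_chunk_size=40.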

Development

Successfully merging this pull request may close these issues.

Allow customizable chunk overflow factor or max chunk size
