Problem
Under an active search, DuckBackend.initialState (packages/buckaroo-duckdb-node/src/backend.ts) re-runs SUMMARIZE over the search-filtered relation, so the scalar summary stats and filtered_rows reflect the matches. The histogram pinned row is not filtered. It is still computed over the unfiltered relation:
computeHistograms(this.source, plan.renamedRelation(this.effectiveSql), plan, sd, this.totalRows) // backend.ts:177
That call passes effectiveSql (which excludes the search) and this.totalRows (the unfiltered row count, used for cat_pop normalization).
Impact
With a search applied, the grid shows filtered counts/means/quantiles but histogram bars for the full table, so the pinned-stats row is internally inconsistent. The pandas Search command re-runs the whole stats pipeline (histograms included) on the filtered DataFrame, so the DuckDB backend diverges from the pandas/polars backends here.
Fix
Compute histograms over the same relation the other stats use under search — searchEffectiveSql(plan) — and normalize by the filtered row count rather than this.totalRows. Add a spike test asserting the histogram bars change when a search is applied (the current histogram spike tests don't exercise search).
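To make the normalization half of the fix concrete, here is a minimal sketch of why the denominator matters. It is not the real computeHistograms code: categoryBars, CategoryBar, and the sample counts are hypothetical, and it assumes cat_pop is a percentage of rows (the actual unit in backend.ts may differ).

```typescript
// Hypothetical stand-in for the categorical part of the histogram step.
// Assumption: cat_pop is "percent of rows in this category".
type CategoryBar = { name: string; cat_pop: number };

function categoryBars(
  counts: Record<string, number>,
  denominator: number,
): CategoryBar[] {
  // Guard against an empty relation (e.g. a search with no matches).
  if (denominator <= 0) return [];
  return Object.entries(counts)
    .map(([name, n]) => ({ name, cat_pop: (n * 100) / denominator }))
    .sort((a, b) => b.cat_pop - a.cat_pop);
}

// Made-up numbers: 1000 rows in the table, 50 of them match the search.
const totalRows = 1000;
const filteredRows = 50;
const filteredCounts = { apple: 30, pear: 20 };

// Dividing filtered counts by the unfiltered total shrinks every bar.
const wrong = categoryBars(filteredCounts, totalRows);
// Dividing by the filtered row count keeps the bars summing to 100.
const right = categoryBars(filteredCounts, filteredRows);

console.log(wrong, right);
```

A spike test for the real backend could assert something similar in spirit: that the bars returned by initialState under a search differ from the unsearched bars, and that the categorical bars are normalized against the filtered row count.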
Context
Surfaced while rebasing #940 (search) onto main after the histogram feature landed; the two features were built in parallel and their interaction is currently untested. Code at backend.ts:177.