fix(api): get_entity returns linked observations instead of empty list#2154
Open
cdbartholomew wants to merge 1 commit into
Open
fix(api): get_entity returns linked observations instead of empty list#2154cdbartholomew wants to merge 1 commit into
cdbartholomew wants to merge 1 commit into
Conversation
`MemoryEngine.get_entity()` returned `"observations": []` hardcoded, so
the entity-detail endpoint (`GET /v1/default/banks/{bank_id}/entities/
{entity_id}`) always reported zero observations regardless of how many
actually referenced the entity.
The traversal needed already existed in the reverse direction. On a
recall result, `_entity_rows_for_units_sql` walks observation →
`source_memory_ids` → `unit_entities` of the source memories to surface
inherited entities. The `_create_memory_links` placeholder in
`consolidation/consolidator.py` documents this transitive design and is
explicit that observations don't carry direct `unit_entities` rows. The
entity-detail path simply never asked the inverse question — "given an
entity, which observations mention it?"
This patch:
- Replaces the stub list with a real query in `get_entity()`, dispatched
through a new `_observations_for_entity_sql()` helper.
- PG branch uses GIN-indexed array overlap on `memory_units.source_memory_ids`
(`a2b3c4d5e6f8_add_gin_index_source_memory_ids` already provides the
supporting index).
- Oracle branch joins through the `observation_sources` junction table
and uses `FETCH FIRST ... ROWS ONLY` per dialect convention.
- Both branches UNION a direct `unit_entities` path so the rare
observation that does carry its own entity row is not dropped.
- Adds `max_observations: int = 50` with a hard cap of 200 so a panel
render can't accidentally pull thousands for a hot entity.
- Ordered by `mentioned_at DESC NULLS LAST` so the freshest
observations surface first.
The HTTP layer at `api/http.py` already iterates `entity["observations"]`
and maps to `EntityObservationResponse(text, mentioned_at)` — no API
contract change.
Test coverage in `test_get_entity_observations.py`:
- Inherited observation (linked via `source_memory_ids`) surfaces.
- Direct observation (rare `unit_entities` row on the observation
itself) also surfaces.
- Observations unrelated to the entity do not surface.
- Ordering: most recent first.
- `max_observations` cap is honored end-to-end.
- Missing entity_id still returns None.
Closes the gap that turned the entity-detail endpoint into a no-op for
any consumer that listed observations under an entity.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
MemoryEngine.get_entity()returns"observations": []hardcoded, soGET /v1/default/banks/{bank_id}/entities/{entity_id}always reports zero observations regardless of how many actually reference the entity. The response schema (EntityDetailResponse.observations: list[EntityObservationResponse]) and the example inEntityObservation's model config both advertise the field. The deprecatedregenerate_entity_observationsendpoint at the same path reinforces that this field was once user-visible. The engine method simply never followed the traversal needed to populate it.The traversal already exists in the reverse direction.
_entity_rows_for_units_sql(memory_engine.py) walks observation →source_memory_ids→unit_entitiesof the source memories to inherit entities on recall results. The_create_memory_linksplaceholder inconsolidation/consolidator.pydocuments this transitive design explicitly:The entity-detail path simply never asked the inverse question — "given an entity, which observations mention it?" That's what this PR adds.
What changes
get_entity()with a real query, dispatched through a new_observations_for_entity_sql()helper that mirrors the dialect split of_entity_rows_for_units_sql.memory_units.source_memory_ids. The supporting index already exists (a2b3c4d5e6f8_add_gin_index_source_memory_ids).observation_sourcesjunction table, usesFETCH FIRST ... ROWS ONLYper the existing Oracle convention inops_oracle.py.unit_entitiespath so an observation that happens to carry its own entity row (rare, e.g. test fixtures, manual inserts) isn't dropped.max_observations: int = 50parameter, hard-capped at 200. Hot entities (thousands of mentions) can't make a single panel render pull unbounded rows.mentioned_at DESC NULLS LASTso the freshest material surfaces first.api/http.pyalready iteratesentity["observations"]and maps toEntityObservationResponse(text=obs.text, mentioned_at=obs.mentioned_at)— no API contract change, no OpenAPI regen needed.Cost model
Same query shape as the existing recall-side traversal, traversed in the opposite direction:
LIMIT 50, driven by top-N sort over the candidate observation set. Well within budget for a click-driven endpoint.source_memory_idsand the partialidx_memory_units_observation_datecarry the work.Test plan
New
tests/test_get_entity_observations.py:source_memory_idsinheritance surfaces under the entity.unit_entitiesrow surfaces too (UNION branch).mentioned_at DESC.max_observationscap is honored end-to-end.entity_idstill returnsNone.Local verification:
ruff checkclean.main).Notes
_create_memory_linksplaceholder remains intentionally a no-op — the design decision to avoid duplicating entity rows for observations is preserved. This PR only restores read-time access to the data that's already present in the graph.EntityDetailPanelenhancement inhindsight-control-plane/src/components/entities-view.tsxto render the now-populatedobservationsfield; left out of this PR to keep scope tight.