remove custom-metadata.md; canonical path is vdbs + notebooks #2195

Open
kheiss-uwzoo wants to merge 12 commits into NVIDIA:main from kheiss-uwzoo:docs/consolidate-custom-metadata-into-vdbs

Conversation

@kheiss-uwzoo

Collaborator

Summary

  • Remove docs/docs/extraction/custom-metadata.md (it duplicates the notebook and VDB README content added in "align metadata docs with VDB filtering guide", #2108).
  • Expand the "Metadata and filtering" section on vdbs.md with a short overview and links to the worked notebooks.
  • Drop the separate MkDocs nav entry; add a redirect from custom-metadata.md to vdbs.md#metadata-and-filtering.
  • Update cross-links and doc-snippet test registry.

Follows Julio's NVBugs 6205401 guidance: VDB/metadata facts live on vdbs.md; runnable walkthroughs stay in notebooks.

Notebooks (canonical examples)

Operator/API reference remains in nemo_retriever/src/nemo_retriever/vdb/README.md.

Test plan

  • MkDocs build; confirm redirect from old custom-metadata URL
  • Nav no longer lists a separate Custom metadata page
  • Notebook links resolve on GitHub
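
The redirect check in the test plan could be partly automated before running the MkDocs build. The following is a minimal sketch, not code from this PR: `check_redirects` is a hypothetical helper, and the short page paths are taken from the PR text as written (the real entries in mkdocs.yml may carry a longer prefix).

```python
def check_redirects(redirects: dict[str, str], pages: set[str]) -> list[str]:
    """Return a list of problems found in an old-page -> new-page redirect map.

    `pages` is the set of Markdown pages that still exist after the change.
    """
    problems = []
    for old, target in redirects.items():
        # Split an optional "#fragment" off the target page path.
        target_page, has_hash, fragment = target.partition("#")
        if old in pages:
            problems.append(f"{old}: source page still exists")
        if target_page not in pages:
            problems.append(f"{old}: target page {target_page} is missing")
        if has_hash and not fragment:
            problems.append(f"{old}: target has an empty fragment")
    return problems


# Redirects as described in this PR (paths assumed relative to one docs folder).
REDIRECTS = {
    "custom-metadata.md": "vdbs.md#metadata-and-filtering",
    "integrations-langchain-llamaindex-haystack.md": "notebooks/index.md",
}
```

Calling `check_redirects(REDIRECTS, pages)` with the set of surviving pages returns an empty list when every deleted page maps to an existing target. It does not verify that the `#metadata-and-filtering` anchor exists in the rendered page; that still needs the manual check listed above.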

@kheiss-uwzoo kheiss-uwzoo requested review from a team as code owners June 1, 2026 18:03
@kheiss-uwzoo kheiss-uwzoo requested a review from edknv June 1, 2026 18:03
@greptile-apps

greptile-apps Bot commented Jun 1, 2026

Contributor

Greptile Summary

This PR removes two redundant documentation files (custom-metadata.md and integrations-langchain-llamaindex-haystack.md), consolidates metadata content into the existing vdbs.md page, and adds MkDocs redirects so existing bookmarks and links continue to resolve.

  • Consolidated metadata docs: The #metadata-and-filtering section in vdbs.md is expanded with the sidecar parameter overview, retriever service flow, and links to both canonical notebooks, replacing the standalone page.
  • Updated cross-links: All in-page references to the two deleted files are rewritten to point at the relevant sections on vdbs.md or notebooks/index.md across five files.
  • Test registry updated: _PUBLIC_RETRIEVER_DOCS in the documentation snippet test swaps the deleted custom-metadata.md for vdbs.md, so Python blocks in the surviving page remain validated for unsupported constructor kwargs.
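
To illustrate what "validated for unsupported constructor kwargs" could mean for a page in the registry, here is a minimal sketch under stated assumptions. It is not the actual test in test_src_documentation_snippets.py: the helper names and the `ALLOWED_KWARGS` allow-list are hypothetical, and the real Retriever constructor signature does not appear in this PR.

```python
import ast
import re

FENCE = "`" * 3  # built indirectly so this example can sit inside a fence itself
PYTHON_BLOCK_RE = re.compile(FENCE + r"python\n(.*?)" + FENCE, re.DOTALL)

# Hypothetical allow-list; the real Retriever constructor is not shown in this PR.
ALLOWED_KWARGS = {"top_k", "vdb"}


def python_blocks(markdown: str) -> list[str]:
    """Return the source of every python-fenced block in a Markdown string."""
    return PYTHON_BLOCK_RE.findall(markdown)


def unsupported_kwargs(source, class_name="Retriever", allowed=ALLOWED_KWARGS):
    """List keyword names passed to class_name(...) that are not in allowed."""
    found = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Call):
            continue
        func = node.func
        # Handle both Retriever(...) and module.Retriever(...).
        name = func.id if isinstance(func, ast.Name) else getattr(func, "attr", None)
        if name == class_name:
            found += [kw.arg for kw in node.keywords if kw.arg and kw.arg not in allowed]
    return found
```

A test built this way would read each registered Markdown file, run `python_blocks` on it, and fail if `unsupported_kwargs` returns anything for a block. Parsing with `ast` rather than executing the snippet keeps the check independent of any installed retriever package.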

Confidence Score: 5/5

Safe to merge — purely a documentation restructure with no code logic changes.

All changes are documentation and test-registry updates. Both previously flagged issues (bare metadata section and missing notebook link) are resolved in this revision. The MkDocs redirects correctly point deleted URLs at the right destinations, the expanded metadata section in vdbs.md accurately describes the sidecar parameters and service flow, and the Python snippet test continues to cover vdbs.md. No source code is modified.

No files require special attention; the one minor gap is that metadata_and_filtered_search.ipynb is not added to the documentation snippet test registry alongside the other metadata notebook.

Important Files Changed

| Filename | Overview |
| --- | --- |
| docs/docs/extraction/vdbs.md | Metadata and filtering section expanded with sidecar parameter prose, retriever service flow, and notebook links; addresses the previously flagged bare-link issue. |
| docs/docs/extraction/notebooks/index.md | Both canonical metadata notebooks now listed: metadata_and_filtered_search.ipynb and nemo_retriever_retriever_query_metadata_filter.ipynb; addresses the previously flagged missing link. |
| docs/mkdocs.yml | Nav sections renumbered after removing two entries; redirects added for both deleted pages, including a fragment anchor target for custom-metadata.md. |
| nemo_retriever/tests/test_src_documentation_snippets.py | _PUBLIC_RETRIEVER_DOCS updated to reference vdbs.md instead of the deleted custom-metadata.md; the existing Python block in vdbs.md is valid and uses no unsupported Retriever kwargs. |
| docs/docs/extraction/custom-metadata.md | Deleted; content migrated to vdbs.md#metadata-and-filtering, and canonical runnable examples remain in the two linked notebooks. |
| docs/docs/extraction/integrations-langchain-llamaindex-haystack.md | Deleted; LangChain and LlamaIndex notebook links survive in notebooks/index.md; MkDocs redirect added to notebooks/index.md. |

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A["custom-metadata.md\n(deleted)"] -->|"301 redirect"| B["vdbs.md#metadata-and-filtering"]
    C["integrations-langchain-llamaindex-haystack.md\n(deleted)"] -->|"301 redirect"| D["notebooks/index.md"]

    B --> E["sidecar params overview\nmeta_dataframe / meta_source_field / meta_fields"]
    B --> F["nemo_retriever_retriever_query_metadata_filter.ipynb\n(worked example)"]
    B --> G["VDB README\n(operator reference)"]

    D --> H["metadata_and_filtered_search.ipynb"]
    D --> F
    D --> I["LangChain / LlamaIndex notebooks"]

    J["workflow-agentic-retrieval.md\nagentic-retrieval-concept.md\ndeployment-options.md\noverview.md\nworkflow-e2e-blueprints.md"] -->|"cross-links updated"| B
    J -->|"cross-links updated"| D

Reviews (18): Last reviewed commit: "Merge branch 'main' into docs/consolidat..."

@kheiss-uwzoo kheiss-uwzoo changed the title from "docs(extraction): remove custom-metadata.md; canonical path is vdbs + notebooks" to "remove custom-metadata.md; canonical path is vdbs + notebooks" Jun 1, 2026
@kheiss-uwzoo kheiss-uwzoo requested a review from jperez999 June 1, 2026 18:12
@kheiss-uwzoo kheiss-uwzoo added the doc Improvements or additions to documentation label Jun 1, 2026
Comment thread docs/docs/extraction/integrations-langchain-llamaindex-haystack.md Outdated
Comment thread docs/docs/extraction/vdbs.md
Comment thread docs/docs/extraction/vdbs.md Outdated
Comment thread docs/docs/extraction/notebooks/index.md
@kheiss-uwzoo kheiss-uwzoo requested a review from randerzander June 5, 2026 17:42
@kheiss-uwzoo kheiss-uwzoo marked this pull request as draft June 5, 2026 23:27
@kheiss-uwzoo kheiss-uwzoo marked this pull request as ready for review June 8, 2026 20:14
Comment thread docs/docs/extraction/notebooks/index.md Outdated
Comment thread docs/docs/extraction/agentic-retrieval-concept.md Outdated
Comment thread docs/docs/extraction/workflow-e2e-blueprints.md Outdated
@kheiss-uwzoo kheiss-uwzoo requested a review from jperez999 June 11, 2026 15:32
@kheiss-uwzoo kheiss-uwzoo self-assigned this Jun 11, 2026
kheiss-uwzoo and others added 9 commits June 11, 2026 16:34
Drop dead metadata_and_filtered_search notebook links; document retriever
service sidecar upload on vdbs.md instead of delegating to VDB README.
Delete integrations-langchain-llamaindex-haystack.md, point inbound links at notebooks/index.md, and add a mkdocs redirect.
Replace duplicated metadata prose with a single notebook link per review.
Revert doc-snippet test list change; belongs outside this docs-only PR.
Users arriving via the deleted custom-metadata.md URL need a short
overview of meta_* sidecar params and filter modes, plus links to the
worked notebooks and VDB README—not a bare hyperlink alone.
Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
Restore vdbs.md metadata landing content with service sidecar guidance, drop dead metadata_and_filtered_search.ipynb links, and point the doc-snippet test registry at vdbs.md instead of deleted custom-metadata.md.
Remove reindex_example.ipynb entry (notebook removed on main in NVIDIA#2163).
Rename framework cross-links to Starter kits to match mkdocs nav label.
@kheiss-uwzoo kheiss-uwzoo force-pushed the docs/consolidate-custom-metadata-into-vdbs branch from a541334 to fafaf61 Compare June 11, 2026 23:35
Resolve modify/delete conflict on custom-metadata.md by keeping the PR
deletion; canonical metadata docs live in vdbs.md with mkdocs redirect.

Labels

doc Improvements or additions to documentation

3 participants