RAGit is a RAG CLI bound to zvec and git that runs inside your project repository.
It collects, analyzes, and retrieves documents produced during AI agent workflows, then version-controls snapshots bound to commit SHAs.
RAGit is a local-first RAG CLI that turns AI agent project documents and context into commit-bound, reusable knowledge inside the repository.
RAGit is not a giant transcript archive. It is an agent-first collaboration memory system that preserves the smallest reusable state needed to resume work at a given commit: goal, constraints, stable decisions, open loops, and next actions. By separating active working memory from durable searchable memory, it helps the next agent recover momentum without replaying the entire past.
The runtime structure shows how ragit connects the CLI, command layer, core services, git-bound snapshots, and local storage.

Diagram, in text form: User / Agent calls the ragit CLI, which dispatches to the Command Layer (init, ingest, query, context pack, memory, session / artifact / harness). The Command Layer drives Core Services (doc authority, manifest, retrieval, memory, artifacts / harness), with Git commit / HEAD providing snapshot binding. Local state lives in the .ragit control plane (config, manifest, memory, artifacts), docs/**, and .ragit/store (documents, chunks). Outputs are query hits, context pack, and recall packet.

- `ragit CLI` is the single entrypoint. Every user or agent workflow starts by dispatching a command through the command layer.
- `Git commit / HEAD` binds manifest selection, so retrieval and recall stay reproducible at a specific repository state.
- `.ragit control plane` stores configuration and tracked knowledge state, while `.ragit/store` holds the local vector index for `documents` and `chunks`.
- User-facing outputs are produced from the same runtime core: `query hits`, `context pack`, and `recall packet`.

Git version-controls source code states. RAGit version-controls AI-working knowledge states bound to the same commit history.
sequenceDiagram
participant Developer
participant Git
participant Repository
participant RAGit
participant Store as ".ragit Store"
participant Agent
Developer->>Git: stage and commit code/docs
Git->>Repository: write commit snapshot
Note over Git,Repository: Git manages code and file history
Git-->>RAGit: trigger post-commit / post-merge hook
RAGit->>Repository: detect changed documents since SHA
RAGit->>Store: chunk, index, and write manifest bound to commit SHA
Note over RAGit,Store: RAGit manages document knowledge and agent context history
Agent->>RAGit: query or context pack at HEAD / specific SHA
RAGit->>Store: load snapshot + retrieval data
RAGit-->>Agent: return commit-bound knowledge/context
- Git answers: "What did the repository look like at this commit?"
- RAGit answers: "What knowledge and context should an agent use at this commit?"
- Together they make code state and AI context state reproducible.
- Preserve project context across AI agent work
- Reproduce knowledge at a specific commit state
- Turn structured docs into agent-ready inputs
- Automate indexing without adding workflow friction
RAGit protects knowledge state, not just files.
- Write paths sanitize before persistence, so transcripts, memory state, artifacts, harness runs, and durable docs do not keep raw-looking secrets by default.
- Admission control runs before persistence on knowledge-writing paths. In `security.admission_mode=enforce`, high-risk payloads are blocked or replaced with a sentinel before they can become persisted knowledge state; legacy repos without this key fall back to `report-only`.
- Retrieval-facing commands mask again before printing or JSON projection, so `query`, `context pack`, `memory recall`, `log`, `timeline`, and `harness pack` do not echo raw secret material back to the user.
- Remote embedding egress is policy-controlled. `security.remote_embedding_policy=allow-sanitized` allows only sanitized query text and durable-doc ingest text to leave the repository; `local-only` blocks remote egress entirely.
- `ragit security audit` inspects control-plane/store/docs/provider posture and admission findings, while `ragit security purge` sanitizes or clears local state without rewriting repo-tracked documents.
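
The admission behavior described above can be sketched roughly as follows. This is an illustrative model, not RAGit's implementation: `admit`, the `Risk` levels, the result shape, and the sentinel string are hypothetical. Only the `enforce` and `report-only` modes, the block-or-sentinel outcome, and the `report-only` fallback come from the text above.

```typescript
// Hypothetical model of the admission step; names and shapes are assumptions.
type AdmissionMode = "enforce" | "report-only";
type Risk = "low" | "high";

interface AdmissionResult {
  persisted: string | null; // null means the payload was blocked
  finding: string | null;   // set whenever the payload is high-risk
}

const SENTINEL = "[REDACTED]"; // placeholder value, not RAGit's real sentinel

function admit(
  payload: string,
  risk: Risk,
  mode: AdmissionMode = "report-only", // fallback for repos without the key
  replaceWithSentinel = true,
): AdmissionResult {
  if (risk === "low") return { persisted: payload, finding: null };
  const finding = "high-risk payload";
  // Assumption: report-only records the finding but still persists the payload.
  if (mode === "report-only") return { persisted: payload, finding };
  return { persisted: replaceWithSentinel ? SENTINEL : null, finding };
}
```

In this sketch the only difference between the two modes is whether a high-risk payload can reach persisted state; the finding is reported either way.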
- `Architecture Decision (ADR)`: durable decision record with rationale and consequences
- `Product Requirement (PRD)`: product problem, users, goals, and success criteria
- `Software Requirements (SRS)`: system-level functional and non-functional requirements
- `Implementation Specification (SPEC)`: implementation-level functional requirements and interface contracts
- `Plan`: execution sequencing, milestones, and work breakdown
- `Domain-Driven Design (DDD)`: bounded contexts, aggregates, and domain structure
- `Glossary`: shared vocabulary for stable project terms
- `Phase and Binding Documents (PBD)`: phase and binding topology for understanding implementation structure and coupling

RAGit does not add SAD, HLD, or LLD as new canonical document types.
Instead, it treats them as external architecture views layered on top of the existing document system.
- `SAD`: repository or system-wide architecture explanation, usually read across architecture overviews plus related `ADR` documents
- `HLD`: higher-level module boundaries, data flow, and topology, usually expressed with `SRS`, `DDD`, and `PBD`
- `LLD`: implementation-unit contracts, interfaces, and state details, usually expressed with `SPEC`

When authors want to make that view explicit, they can add an optional frontmatter hint:
---
type: spec
architecture_view: lld
---

`architecture_view` is advisory only.
RAGit still classifies, validates, ingests, and retrieves documents by canonical type.
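
To make the "advisory only" rule concrete, here is a minimal sketch of a reader that parses the frontmatter shown above and classifies by `type` alone. `readFrontmatter` and `canonicalType` are hypothetical helpers written for illustration; RAGit's real parser is not shown in this README, and this one handles only flat `key: value` pairs.

```typescript
// Minimal frontmatter reader for flat `key: value` pairs (not a full YAML parser).
function readFrontmatter(source: string): Record<string, string> {
  const lines = source.split("\n");
  const fields: Record<string, string> = {};
  if (lines[0]?.trim() !== "---") return fields;
  for (const line of lines.slice(1)) {
    if (line.trim() === "---") break; // closing delimiter ends the block
    const idx = line.indexOf(":");
    if (idx > 0) fields[line.slice(0, idx).trim()] = line.slice(idx + 1).trim();
  }
  return fields;
}

// Hypothetical classifier: only `type` decides the canonical document type.
// `architecture_view` is parsed but deliberately ignored here.
function canonicalType(source: string): string | undefined {
  return readFrontmatter(source)["type"];
}
```

With the example frontmatter, `canonicalType` yields `spec` whether or not `architecture_view: lld` is present.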
Requirements:
- Node.js `20.19.0` or newer
- pnpm `10.13.1` or newer

For repository-local development:
pnpm install
pnpm ragit --help

Inside this repository checkout, run CLI commands with `pnpm ragit <command>`.

For the published CLI:
npm install -g ragit
pnpm add -g ragit
bun add -g ragit
npx ragit --help

When the package is installed globally, use `ragit <command>`.

`pnpm build` is optional for repository-local usage.
Run it only when you need to generate `dist/` artifacts or verify the packaged CLI entrypoint.

pnpm build

- Primary URL (English): https://rhiokim.github.io/ragit/en/
- Korean URL: https://rhiokim.github.io/ragit/ko/
- English is the source of truth, and Korean is provided in the same structure.
- New project onboarding starts at https://rhiokim.github.io/ragit/en/docs/getting-started/ and https://rhiokim.github.io/ragit/ko/docs/getting-started/.

Run locally:

pnpm docs:dev

Build static output and preview:

pnpm docs:check:i18n
pnpm docs:build
pnpm docs:serve

Deployment:
- GitHub Actions deploys automatically to `gh-pages` when `main` is pushed.
- For manual redeploy, run `docs-gh-pages` via `workflow_dispatch`.
- In Repository Settings > Pages, set Source to `GitHub Actions`.

- `publish.yml` validates tags against `package.json.version` and publishes only on `vX.Y.Z` tag pushes.
- `workflow_dispatch` runs the same release checks without publishing, so you can rehearse the pipeline before the first release.
- Before enabling automatic publish, configure npm Trusted Publishing for `rhiokim/ragit` and the GitHub Actions workflow.
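
The tag rule above (publish only when a `vX.Y.Z` tag matches `package.json.version`) can be approximated with a small check. How `publish.yml` actually performs the comparison is not shown here, so `tagMatchesVersion` is a hypothetical helper that assumes the version must have the plain `X.Y.Z` shape.

```typescript
// Returns true only when the tag is exactly `v` + the package.json version
// and that version has the plain X.Y.Z shape (no pre-release suffix).
function tagMatchesVersion(tag: string, packageVersion: string): boolean {
  const plainSemver = /^\d+\.\d+\.\d+$/;
  return plainSemver.test(packageVersion) && tag === `v${packageVersion}`;
}
```

Under these assumptions, a tag that lacks the `v` prefix or carries a different version would fail the check.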
Release validation flow:
pnpm release:check
VERSION=$(node -p 'require("./package.json").version')
git tag "v${VERSION}"
git push origin --tags

The README shows the canonical first-use workflow for RAGit. Use Getting Started for project onboarding, Commands for the full command map, and Agent CLI Contract for machine-safe integration rules.
init prepares the repository, but it does not make the repo search-ready.
Retrieval starts only after ingest writes snapshot-backed knowledge state.
pnpm ragit init
pnpm ragit ingest --all
pnpm ragit status --format json
pnpm ragit query "project goal" --view minimal --format both

`query` returns raw retrieval hits from indexed knowledge at a snapshot.

pnpm ragit query "DDD bounded context principles" --view minimal --format both

`context pack` turns retrieval hits into a budgeted handoff packet for the next agent step.

pnpm ragit context pack "Implementation plan for this sprint" --budget 1200 --view minimal --format both

`memory recall` rebuilds a resume packet by layering working state on top of retrieval.

pnpm ragit memory recall "resume auth flow" --view minimal --format both

Use `describe` as the first step when wiring RAGit into an agent workflow.
Install managed hooks only after the first successful ingest if you want automatic post-commit or post-merge indexing.

pnpm ragit describe query --format json
pnpm ragit hooks install --dry-run --format json

Use these commands after the happy path when you need history, trust checks, recovery views, or safe remediation planning.

pnpm ragit log --max-count 5 --view default --format both
pnpm ragit narrative --format both
pnpm ragit drift --scope all --view default --format both
pnpm ragit repair --scope all --format json
pnpm ragit security audit --format json

`narrative` writes a self-contained HTML recovery report from snapshots, artifacts, and events.
Use --emit-model only when you want the isolated OpenTUI explorer under tools/narrative-tui; the HTML report remains the canonical artifact.
For Recovery View details, freshness and validation axes, and viewer boundaries, see the narrative command docs.
These commands are not part of the first-use path. Use them for configuration, deeper diagnosis, purge/remediation work, or legacy store migration.
pnpm ragit config set retrieval.top_k 8
pnpm ragit doctor --format json
pnpm ragit security purge --target control-plane --dry-run --format json
pnpm ragit migrate from-json-store --dry-run
pnpm ragit migrate from-sqlitevss --dry-run

The flow shows how `ragit ingest` turns repository documents and bound artifacts into a searchable snapshot.

Sequence diagram, in text form (participants: User / Agent, ragit CLI, run Ingest, Repo docs, Session / Harness artifacts, .ragit/store, Manifest):

1. User / Agent -> ragit CLI: `ragit ingest ...`
2. ragit CLI -> run Ingest: parse mode + source options
3. run Ingest (internal): ensure .ragit, load config, check HEAD
4. run Ingest -> Repo docs: resolve candidates
5. Loop over each supported doc: hash -> mask -> detect -> validate -> chunk -> embed
6. If `--dry-run`: return the planned summary to ragit CLI
7. Otherwise (apply):
   - bind pending artifacts + build artifact chunks (Session / Harness artifacts)
   - write docs + chunks (.ragit/store)
   - build + write snapshot (Manifest)
   - return the ingest summary to ragit CLI
8. ragit CLI -> User / Agent: searchable snapshot summary

- Candidate resolution changes by mode: explicit `--path`, glob-style `--files`, incremental `--since`, or the default full-snapshot scan.
- `--dry-run` stops before writing `.ragit/store` or a new manifest and only returns the planned ingest summary.
- The apply path is where pending artifact binding, artifact chunk construction, and store/manifest writes actually happen.
- The final searchable truth comes from the manifest snapshot, not from raw files or chunks alone.
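
The mode selection and the dry-run/apply split above might be modeled like this. The option shape, the function names, and especially the precedence between flags are assumptions made for illustration; the README only lists the modes and does not say how conflicting flags are resolved.

```typescript
// Hypothetical option shape mirroring the flags named above.
interface IngestOptions {
  path?: string;    // --path
  files?: string;   // --files (glob-style)
  since?: string;   // --since
  dryRun?: boolean; // --dry-run
}

type CandidateMode = "path" | "files" | "since" | "full-snapshot";

// Precedence between flags is an assumption; the source only lists the modes.
function resolveMode(opts: IngestOptions): CandidateMode {
  if (opts.path !== undefined) return "path";
  if (opts.files !== undefined) return "files";
  if (opts.since !== undefined) return "since";
  return "full-snapshot"; // default scan
}

// Dry runs plan only; the apply path is the only one that writes.
function plannedWrites(opts: IngestOptions): string[] {
  if (opts.dryRun) return [];
  return ["bind pending artifacts", "write .ragit/store", "write manifest"];
}
```

Keeping the write list empty for dry runs mirrors the rule that `--dry-run` returns only the planned summary.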
.ragit/
config.toml
docs/index.json
guide/guide-index.json
guide/templates/
log/
manifest/<commit-sha>.json
reports/
security/
memory/sessions/
memory/working/
artifacts/session/
artifacts/harness/
store/meta.json
store/documents/
store/chunks/
cache/
hooks/
docs/
memory/
decisions/
glossary/
plans/
Git tracking policy:
| Category | Paths | Default |
|---|---|---|
| Project contract | `.ragit/config.toml`, `.ragit/guide/**`, `.ragit/docs/index.json`, `AGENTS.md`, `RAGIT.md`, durable docs under `docs/**` | Track |
| Local runtime state | `.ragit/store/**`, `.ragit/cache/**`, `.ragit/log/**`, `.ragit/reports/**`, `.ragit/security/**`, `.ragit/memory/sessions/**`, `.ragit/memory/working/**`, `.ragit/artifacts/session/**` | Ignore |
| Optional snapshot history | `.ragit/manifest/**` | Ignore in `safe`; track in `snapshot-history` or `dogfood` |
| Optional reviewed harness assets | `.ragit/artifacts/harness/**` | Ignore in `safe` and `snapshot-history`; track in `dogfood` |

For a normal product repository, accept the safe policy. For a repository that reviews RAGit snapshot history, keep manifests tracked. For a dogfooding/testbed repository, keep both manifests and reviewed harness artifacts tracked.
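
The three policies in the table reduce to a small lookup. The sketch below derives the ignored paths for each policy directly from the table rows; `ignoredPaths` is a hypothetical helper, and the exact `.gitignore` lines that `init` writes are not shown in this README.

```typescript
type TrackingPolicy = "safe" | "snapshot-history" | "dogfood";

// Paths the table marks as local runtime state: ignored under every policy.
const ALWAYS_IGNORED: string[] = [
  ".ragit/store/**",
  ".ragit/cache/**",
  ".ragit/log/**",
  ".ragit/reports/**",
  ".ragit/security/**",
  ".ragit/memory/sessions/**",
  ".ragit/memory/working/**",
  ".ragit/artifacts/session/**",
];

function ignoredPaths(policy: TrackingPolicy): string[] {
  const ignored = [...ALWAYS_IGNORED];
  // Manifests are tracked in snapshot-history and dogfood, ignored in safe.
  if (policy === "safe") ignored.push(".ragit/manifest/**");
  // Harness artifacts are tracked only in dogfood.
  if (policy !== "dogfood") ignored.push(".ragit/artifacts/harness/**");
  return ignored;
}
```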
- `memory wrap`: save a session summary into `.ragit/memory/sessions/` and refresh working state in `.ragit/memory/working/`
- `memory recall`: combine working state and snapshot-scoped retrieval into an agent-ready recall packet
- `memory promote`: crystallize promotion candidates into searchable long-term docs under `docs/memory/**` and ingest them immediately when `HEAD` exists

This split is intentional:
- `.ragit/memory/**` is the local control plane for working state and session history; promote durable knowledge into `docs/memory/**` when it should be reviewed and tracked
- `docs/memory/**` is the searchable long-term memory corpus that participates in normal ingest/query flows

- Prefer `--format json` for machine consumers.
- Use `ragit describe <command> --format json` before integrating a command for the first time.
- Prefer `--view minimal` for `query`, `context pack`, and `memory recall`.
- Prefer `--input <path|->` for structured agent payloads.
- Run mutating commands with `--dry-run` first: `ingest`, `hooks install`, `hooks uninstall`, `memory wrap`, `memory promote`.
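
An agent wrapper could encode these preferences in one place. The sketch below only assembles an argument list; `buildArgs` is a hypothetical helper, and whether every command accepts each flag is an assumption that should be confirmed with `ragit describe <command> --format json` before relying on it.

```typescript
// Commands the guidance above lists as mutating.
const MUTATING: string[] = ["ingest", "hooks install", "hooks uninstall", "memory wrap", "memory promote"];
// Commands for which `--view minimal` is preferred.
const MINIMAL_VIEW: string[] = ["query", "context pack", "memory recall"];

// Builds an argument list for `ragit`; `rehearse` requests a dry run first.
function buildArgs(command: string, extra: string[] = [], rehearse = true): string[] {
  const args = [...command.split(" "), ...extra];
  if (MINIMAL_VIEW.indexOf(command) !== -1) args.push("--view", "minimal");
  if (rehearse && MUTATING.indexOf(command) !== -1) args.push("--dry-run");
  args.push("--format", "json");
  return args;
}
```

A caller would typically run the rehearsal form first, inspect the JSON, and then call again with `rehearse` set to `false`.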
- Repository-managed source: `skills/use-ragit`
- Codex install target: `${CODEX_HOME:-$HOME/.codex}/skills/use-ragit` via copy or symlink
- Shared agent-neutral references for Claude and Gemini: `skills/use-ragit/references/`

pnpm ragit init is now a discover-first bootstrap command.
It still prepares .ragit/**, AGENTS.md, guide assets, and the local zvec store, but it does that only after it inspects the repository and decides what knowledge already exists.
Default flow:
- Check Git environment (and optionally run `git init`)
- Scan repository code/docs/build files
- Select the repository mode: `empty`, `existing`, `docs-heavy`, or `monorepo`
- Compute documentation coverage, maturity, and knowledge-slot mapping
- Reuse existing repository docs first and plan missing foundational docs
- Write stage-1 draft docs plus `.ragit/**`
- Choose the `.gitignore` policy for RAGit runtime data
- Bootstrap the zvec canonical store
- Print the final summary and next actions

What init prepares:
- Git-aware repository normalization
- Existing-doc discovery and coverage evaluation
- Stage-1 foundational drafts when missing:
  - `RAGIT.md`
  - `docs/workspace-map.md`
  - `docs/ragit/ingestion-policy.md`
  - `docs/known-gaps.md`
  - `docs/adr/README.md`
- `.ragit/config.toml`, `.ragit/guide/templates/*`, and `.ragit/guide/guide-index.json`
- `.gitignore` entries for local-only RAGit runtime state, with interactive choices for manifest and harness artifact tracking
- Empty zvec collections under `.ragit/store/`
- Next-action guidance for `hooks install` and `ingest`

What init does not prepare:
- No searchable corpus, chunk records, or manifests
- No zvec document/chunk upsert
- No query-ready knowledge state during `init`

In other words, init makes the repository diagnosed, foundation-ready, and zvec-store-ready, not search-ready.
storage.backend = "zvec" still means the canonical backend, and searchable knowledge still begins only after pnpm ragit ingest ... runs.
Supported options:
pnpm ragit init --mode auto --strategy balanced --merge-existing
pnpm ragit init --yes # non-interactive with defaults
pnpm ragit init --non-interactive # alias of --yes
pnpm ragit init --git-init # allow git init in non-interactive mode
pnpm ragit init --dry-run --output json
pnpm ragit init --output json # JSON summary output

- `--cwd` may point to the repository root or any nested path inside the worktree; `init` normalizes to the Git root before writing `.ragit` or `AGENTS.md`.
- `--mode` overrides repository-mode detection.
- `--strategy` controls how aggressively stage-1 draft docs are generated.
- `--dry-run` computes the full analysis report without writing files or bootstrapping storage.
- zvec bootstrap currently supports `darwin/arm64`, `linux/arm64`, and `linux/x64`.

Recommended flow after init:
pnpm ragit migrate from-json-store # only if summary says migrationRequired=true
pnpm ragit hooks install
pnpm ragit ingest --all

- `post-commit`: automatically indexes changes from `HEAD~1..HEAD`
- `post-merge`: automatically indexes changes from `${ORIG_HEAD:-HEAD~1}..HEAD`
- Failures are warning-only and do not block commit/merge flows.
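
The two hook ranges above follow a simple rule, sketched here in TypeScript for clarity. `hookRange` is a hypothetical helper, not part of RAGit; it mirrors the shell default expansion `${ORIG_HEAD:-HEAD~1}`, which falls back when the variable is unset or empty.

```typescript
type HookName = "post-commit" | "post-merge";

// Returns the revision range a hook would index.
// `origHead` stands for the ORIG_HEAD value, when one is available.
function hookRange(hook: HookName, origHead?: string): string {
  if (hook === "post-commit") return "HEAD~1..HEAD";
  // Same fallback as `${ORIG_HEAD:-HEAD~1}`: unset or empty uses HEAD~1.
  const base = origHead !== undefined && origHead !== "" ? origHead : "HEAD~1";
  return `${base}..HEAD`;
}
```

Using `ORIG_HEAD` after a merge lets the range cover every commit the merge brought in, not just the last one.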
- 1st pass: zvec vector search scoped to the snapshot manifest
- 2nd pass: keyword score
- Final score: `alpha * vector + (1-alpha) * keyword` (default `alpha=0.7`)
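
The final-score formula can be worked through in a few lines. The `Hit` shape and the function names are illustrative, and the sketch assumes both scores are already normalized to a comparable range, which this README does not state.

```typescript
// Illustrative hit shape; assumes both scores share a comparable scale.
interface Hit {
  id: string;
  vector: number;  // 1st-pass vector similarity
  keyword: number; // 2nd-pass keyword score
}

// final = alpha * vector + (1 - alpha) * keyword
function hybridScore(hit: Hit, alpha = 0.7): number {
  return alpha * hit.vector + (1 - alpha) * hit.keyword;
}

// Sorts a copy of the hits by descending final score.
function rank(hits: Hit[], alpha = 0.7): Hit[] {
  return [...hits].sort((a, b) => hybridScore(b, alpha) - hybridScore(a, alpha));
}
```

For example, a hit with vector 0.9 and keyword 0.1 scores about 0.66 at the default alpha, narrowly ahead of a hit with vector 0.5 and keyword 1.0 at about 0.65. Lowering alpha to 0.5 reverses the order, since keyword matches then carry equal weight.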
- Secret masking is enabled by default during ingestion (`security.secret_masking=true`)
- OpenAI/GitHub/AWS keys and `api_key`/`token`/`secret` patterns are masked.
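
As a rough illustration of pattern-based masking, the sketch below redacts a few well-known token shapes plus `api_key`/`token`/`secret` assignments. The regular expressions and the `***` mask are assumptions chosen for the example; they approximate common key formats and are not RAGit's actual rule set.

```typescript
const MASK = "***"; // placeholder; RAGit's real mask token is not shown here

// Approximate shapes of common credentials (assumptions, not RAGit's rules).
const TOKEN_PATTERNS: RegExp[] = [
  /sk-[A-Za-z0-9_-]{20,}/g,      // OpenAI-style key
  /gh[pousr]_[A-Za-z0-9]{36,}/g, // GitHub token prefixes
  /AKIA[0-9A-Z]{16}/g,           // AWS access key ID
];

// Matches `api_key=...`, `token: ...`, `secret = ...` and keeps the key name.
const ASSIGNMENT = /\b(api_key|token|secret)(\s*[:=]\s*)(\S+)/gi;

function maskSecrets(text: string): string {
  let masked = text;
  for (const pattern of TOKEN_PATTERNS) masked = masked.replace(pattern, MASK);
  return masked.replace(ASSIGNMENT, `$1$2${MASK}`);
}
```

Keeping the key name while replacing only the value leaves masked documents readable enough to stay useful for retrieval.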
RAGit is licensed under Apache-2.0. The root LICENSE file is the single source of truth for license terms across this repository.
pnpm test