A lightweight, local Retrieval-Augmented Generation (RAG) web application that allows you to chat with your documents using local LLMs. Built with Flask, LangChain, ChromaDB, and Ollama.
- Local & Private: Your documents and queries never leave your machine.
- Multi-Session Support: Organise different topics into separate chat sessions.
- Document Management: Upload and index various file formats including:
- PDF (
.pdf) - Text (
.txt) - CSV (
.csv) - Markdown (
.md)
- PDF (
- Flexible Configuration:
- Memory: Adjust how many past interactions are included in the context.
- Collections: Group documents into different vector database collections.
- Model Selection: Switch between different local models available on Ollama.
- Smart Context: Automatically retrieves relevant chunks from your documents to answer questions.
- Interactive UI: Clean, responsive interface with Markdown support and source citations.
- Fun Empty State: Displays random interesting Wikipedia articles while you wait or before you start.
- Python 3.8+
- Ollama: Download and install from ollama.com.
- Required Models:
- For answering:
ollama pull deepseek-r1:1.5b(or your preferred model) - For embeddings:
ollama pull nomic-embed-text
- For answering:
-
Clone the repository:
git clone <repository-url> cd RAG-web
-
Create a virtual environment:
python -m venv venv source venv/bin/activate # On Windows: venv\Scripts\activate
-
Install dependencies:
pip install -r requirements.txt
- Start Ollama: Ensure the Ollama service is running on your machine.
- Run the application:
Or use the provided batch file (Windows):
python app.py
run.bat
- Open in Browser: Navigate to
http://127.0.0.1:5000.
- Create a Session: Click the
+button in the sidebar to start a new chat. - Upload Documents: Click the paperclip icon, select your files, and they will be indexed into the current collection.
- Configure:
- Context Memory: Set how many previous message pairs to remember.
- Collection: Name your vector store (e.g., "LegalDocs", "Manuals").
- Model: Specify the Ollama model name you want to use.
- Ask Questions: Type your query in the input bar. The assistant will search your documents and provide an answer with citations.
RAG-web/
├── app.py # Main Flask application & API routes
├── utils.py # RAG logic, document loaders, & embedding helpers
├── requirements.txt # Python dependencies
├── sessions.json # Persisted chat sessions (generated)
├── chroma/ # Local vector database storage (generated)
├── static/
│ ├── app.js # Frontend logic (Vanilla JS)
│ └── style.css # Styling
└── templates/
└── index.html # Main UI template
Feel free to fork, open issues, and submit PRs
MIT License.