Local Knowledge Base (RAG) Template

This template provides a simple, local Retrieval-Augmented Generation (RAG) system for your Data Engineering documents. It ingests PDF, Markdown, plain-text, and Jupyter Notebook files into a local vector database (ChromaDB) that you can then query.

Features

Multi-format Support: Handles .pdf, .md, .txt, and .ipynb files.
Local Embeddings: Uses sentence-transformers/all-MiniLM-L6-v2 (runs entirely on your CPU/GPU, no API keys required).
Persistent Storage: Saves the vector database locally in ./chroma_db.
Easy Querying: Simple CLI to ask questions against your document set.
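
To illustrate the multi-format support, here is a minimal sketch of how supported files might be discovered and how a notebook could be flattened to text. It is not taken from ingest.py; the names `find_documents` and `notebook_to_text` are hypothetical, and only the standard library is used.

```python
import json
from pathlib import Path

# Extensions named in the feature list above.
SUPPORTED_EXTENSIONS = {".pdf", ".md", ".txt", ".ipynb"}


def find_documents(source_dir):
    """Return supported files under source_dir, sorted for stable ordering."""
    root = Path(source_dir)
    return sorted(
        p for p in root.rglob("*")
        if p.is_file() and p.suffix.lower() in SUPPORTED_EXTENSIONS
    )


def notebook_to_text(path):
    """Join the markdown and code cells of a Jupyter notebook into one string."""
    nb = json.loads(Path(path).read_text(encoding="utf-8"))
    parts = []
    for cell in nb.get("cells", []):
        if cell.get("cell_type") in ("markdown", "code"):
            source = cell.get("source", "")
            # The notebook format stores cell source as a string or a list of lines.
            parts.append(source if isinstance(source, str) else "".join(source))
    return "\n\n".join(parts)
```

PDF extraction is left out here because it needs a third-party parser; check requirements.txt for the one this template actually uses.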

Setup

Install Dependencies:
```
pip install -r requirements.txt
```
Prepare Your Documents: Place your documents in a folder (e.g., data/my_docs).
Ingest Data: Run the ingestion script to process your documents and build the database.
```
python ingest.py --source_dir /path/to/your/documents
```
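
As a rough picture of what an ingestion step like this can involve, the sketch below splits text into overlapping chunks and writes them to a persistent ChromaDB collection embedded with all-MiniLM-L6-v2. It is an assumption-laden illustration, not the contents of ingest.py: the function names, the chunk sizes, and the collection name `docs` are all hypothetical.

```python
def chunk_text(text, chunk_size=1000, overlap=200):
    """Split text into overlapping character windows (sizes are illustrative)."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunk = text[start:start + chunk_size]
        if chunk.strip():
            chunks.append(chunk)
        if start + chunk_size >= len(text):
            break  # the last window already reached the end of the text
    return chunks


def store_chunks(chunks, source, persist_dir="./chroma_db", collection_name="docs"):
    """Embed chunks locally and persist them; returns the collection size."""
    # Third-party imports are local so chunk_text works without them installed.
    import chromadb
    from chromadb.utils import embedding_functions

    embed = embedding_functions.SentenceTransformerEmbeddingFunction(
        model_name="all-MiniLM-L6-v2"
    )
    client = chromadb.PersistentClient(path=persist_dir)
    # "docs" is a hypothetical collection name, not one confirmed by the template.
    collection = client.get_or_create_collection(
        name=collection_name, embedding_function=embed
    )
    if chunks:
        # upsert lets the same source be re-ingested without duplicate-id errors.
        collection.upsert(
            ids=[f"{source}-{i}" for i in range(len(chunks))],
            documents=chunks,
            metadatas=[{"source": source, "chunk": i} for i in range(len(chunks))],
        )
    return collection.count()
```

Overlapping windows are one common choice because they keep a sentence that straddles a chunk boundary retrievable from at least one chunk; the actual script may split documents differently.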

Query the Knowledge Base: Ask questions about your data.

```
python query.py "What are the best practices for dbt macros?"
```
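
For orientation, a query against the persisted database could look roughly like the sketch below. It assumes a collection named `docs` in ./chroma_db embedded with all-MiniLM-L6-v2; the names `ask` and `format_results` are hypothetical, and query.py itself may retrieve and present results differently.

```python
def format_results(results):
    """Turn a ChromaDB query result for one question into printable lines."""
    documents = results["documents"][0]
    metadatas = results["metadatas"][0]
    distances = results["distances"][0]
    lines = []
    for rank, (doc, meta, dist) in enumerate(zip(documents, metadatas, distances), 1):
        source = (meta or {}).get("source", "unknown")
        # Show only the first 80 characters of each chunk as a preview.
        lines.append(f"{rank}. [{source}] distance={dist:.3f}: {doc[:80]}")
    return lines


def ask(question, persist_dir="./chroma_db", collection_name="docs", n_results=3):
    """Return the closest stored chunks for a question (retrieval only)."""
    # Third-party imports are local so format_results works without them installed.
    import chromadb
    from chromadb.utils import embedding_functions

    embed = embedding_functions.SentenceTransformerEmbeddingFunction(
        model_name="all-MiniLM-L6-v2"
    )
    client = chromadb.PersistentClient(path=persist_dir)
    # The collection name is an assumption; use whatever the ingestion step created.
    collection = client.get_collection(name=collection_name, embedding_function=embed)
    results = collection.query(query_texts=[question], n_results=n_results)
    return format_results(results)
```

The question must be embedded with the same model that was used at ingestion time, otherwise the distances are not meaningful.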

Requirements

See requirements.txt.
