
LightRAG + ArangoDB: Multi-Hop Reasoning with Knowledge Graphs

A complete implementation of LightRAG using ArangoDB as the storage backend, demonstrating how knowledge graphs enable multi-hop reasoning that traditional RAG systems cannot achieve.

🎯 What This Project Demonstrates

Traditional RAG retrieves documents based on vector similarity alone. It struggles with questions that require connecting information across multiple documents.

LightRAG + ArangoDB builds a knowledge graph from your documents, enabling:

  • Multi-hop reasoning: Answer questions that require traversing relationships between entities
  • Better context retrieval: Find relevant information through graph connections, not just semantic similarity
  • Unified storage: Documents, embeddings, and relationships all stored in ArangoDB

🚀 Quick Start

Prerequisites

  • Docker & Docker Compose
  • OpenAI API key (for embeddings and LLM)

1. Clone and Configure

```bash
git clone https://github.com/YOUR_USERNAME/lightrag-arangodb.git
cd lightrag-arangodb

# Create environment file
cp .env.example .env

# Add your OpenAI API key to .env
nano .env
```

2. Start Services

```bash
# Start ArangoDB and the LightRAG container
docker compose up -d --build

# Create the multihop database for the demo
docker exec arangodb-single arangosh \
  --server.password openSesame \
  --javascript.execute-string "db._createDatabase('multihop');"
```

3. Run the Multi-Hop Demo

```bash
docker exec lightrag-app python multi_hop_demo.py
```

4. Re-run Queries Without Ingestion (Optional)

After running the full demo once, you can re-run queries against the existing database without re-ingesting:

```bash
# Copy query script to container
docker cp query_multihop.py lightrag-app:/app/

# Run queries only (saves results to JSON)
docker exec lightrag-app python query_multihop.py

# Get the results file
docker cp lightrag-app:/app/multihop_results.json ./
```

📊 Understanding Multi-Hop Reasoning

The `multi_hop_demo.py` script uses completely fabricated data about a fictional company called "Nexova Technologies" to prove that answers come from the knowledge graph, not the LLM's training data.

Example: Information Spread Across Documents

Document 1: "PJ Kowalski is the lead architect at Nexova Technologies. He reports directly to Daniel Chen."

Document 2: "Daniel Chen is the VP of Engineering. His office is on the 4th floor."

Query: "What floor is the office of the person PJ Kowalski reports to?"

Reasoning Path: PJ Kowalski → reports to Daniel Chen → office on 4th floor

A traditional vector search might not connect these documents, but the knowledge graph traverses:

  1. Find "PJ Kowalski" entity
  2. Follow relationship to "Daniel Chen" (reports to)
  3. Find "4th floor" connected to Daniel Chen's office
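
The three traversal steps above can be sketched with a toy in-memory graph. This is an illustration only: in the running system the walk happens as a graph query inside ArangoDB, and the relation names here (`reports_to`, `office_on`) are stand-ins, not the exact edge labels LightRAG extracts.

```python
# Toy entity graph standing in for the one LightRAG builds from the documents.
entities = {
    "PJ Kowalski": [("reports_to", "Daniel Chen")],
    "Daniel Chen": [("office_on", "4th floor")],
    "4th floor": [],
}

def traverse(start, max_hops=2):
    """Breadth-first walk: collect every edge reachable within max_hops."""
    hops, frontier = [], [start]
    for _ in range(max_hops):
        nxt = []
        for node in frontier:
            for relation, target in entities.get(node, []):
                hops.append((node, relation, target))
                nxt.append(target)
        frontier = nxt
    return hops

print(traverse("PJ Kowalski"))
# [('PJ Kowalski', 'reports_to', 'Daniel Chen'), ('Daniel Chen', 'office_on', '4th floor')]
```

A vector index has no equivalent of this walk: unless both documents happen to be semantically close to the query, one of the two hops is simply never retrieved.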

📁 Project Structure

```
├── arangodb_impl.py      # ArangoDB storage implementations for LightRAG
├── demo.py               # Basic demo with sample documents
├── multi_hop_demo.py     # Multi-hop reasoning demonstration (ingest + query)
├── query_multihop.py     # Query-only script (no ingestion, saves results to JSON)
├── docker-compose.yml    # Docker services configuration
├── Dockerfile            # LightRAG container build
├── requirements.txt      # Python dependencies
└── .env.example          # Environment template
```

📊 Results Output

The `query_multihop.py` script saves detailed results to `multihop_results.json`:

```json
{
  "timestamp": "2026-02-25T...",
  "summary": {
    "naive_correct": 3,
    "hybrid_correct": 7,
    "total_queries": 8,
    "winner": "hybrid"
  },
  "queries": [
    {
      "query": "What color is the car driven by the manager of employee NX-4472?",
      "expected": "red",
      "reasoning": "NX-4472 = PJ → PJ's manager = Daniel → Daniel's car = cherry red",
      "hops": 3,
      "naive": { "response": "...", "correct": false },
      "hybrid": { "response": "...", "correct": true }
    }
  ]
}
```
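
A file with this shape is easy to post-process. The snippet below inlines a sample payload mirroring the structure above (field names taken from that example) instead of reading the file from disk:

```python
import json

# Sample payload mirroring multihop_results.json; in practice you would use
# json.load(open("multihop_results.json")) instead.
results = json.loads("""
{
  "summary": {"naive_correct": 3, "hybrid_correct": 7,
              "total_queries": 8, "winner": "hybrid"},
  "queries": [
    {"query": "...", "hops": 3,
     "naive": {"correct": false}, "hybrid": {"correct": true}}
  ]
}
""")

s = results["summary"]
print(f"naive {s['naive_correct']}/{s['total_queries']} vs "
      f"hybrid {s['hybrid_correct']}/{s['total_queries']} -> winner: {s['winner']}")

# Queries the graph answered correctly but plain vector search missed
graph_wins = [q for q in results["queries"]
              if q["hybrid"]["correct"] and not q["naive"]["correct"]]
print(f"{len(graph_wins)} graph-only win(s)")
```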

🔧 Storage Implementations

This project implements all four LightRAG storage backends for ArangoDB:

| Storage Type | Class | Purpose |
|--------------|-------|---------|
| Graph | `ArangoDBStorage` | Entity nodes and relationship edges |
| KV | `ArangoDBKVStorage` | Document chunks and metadata |
| Vector | `ArangoDBVectorStorage` | Embeddings for semantic search |
| Doc Status | `ArangoDBDocStatusStorage` | Document processing tracking |

🔍 Query Modes

LightRAG supports different query modes:

```python
from lightrag import LightRAG, QueryParam

# Vector-only search (like traditional RAG)
result = await rag.aquery(query, param=QueryParam(mode="naive"))

# Knowledge graph + vector search (multi-hop reasoning)
result = await rag.aquery(query, param=QueryParam(mode="hybrid", enable_rerank=False))
```

🌐 Accessing ArangoDB

Once the services are running, open the ArangoDB web interface at http://localhost:8529 and log in as `root` with password `openSesame`.

Explore the knowledge graph visually to see entities and their relationships!

📝 Configuration

Environment Variables

```bash
# Required
OPENAI_API_KEY=sk-your-key-here

# ArangoDB (defaults work with docker-compose)
ARANGO_HOST=http://arangodb:8529
ARANGO_USERNAME=root
ARANGO_PASSWORD=openSesame
ARANGO_DATABASE=multihop
```
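
One way the scripts might consume this configuration is sketched below; this is an assumption about the loading code (the repo may use python-dotenv or read the variables elsewhere), with defaults matching the docker-compose values above:

```python
import os

# Read connection settings, falling back to the docker-compose defaults.
cfg = {
    "host": os.getenv("ARANGO_HOST", "http://arangodb:8529"),
    "username": os.getenv("ARANGO_USERNAME", "root"),
    "password": os.getenv("ARANGO_PASSWORD", "openSesame"),
    "database": os.getenv("ARANGO_DATABASE", "multihop"),
}

# OPENAI_API_KEY has no sensible default and must be provided.
if not os.getenv("OPENAI_API_KEY"):
    print("warning: OPENAI_API_KEY is not set")

print(cfg)
```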


📄 License

MIT License - feel free to use this for your own projects!


Built with ❤️ using LightRAG and ArangoDB
