Index a codebase into a SQLite property graph and ChromaDB embeddings, with a FastMCP server for agents.
From any project root (including this repo after pip install -e ".[local-embed]" or pip install -e .):
cd /path/to/repo
pip install -e ".[local-embed]" # optional: local embeddings (sentence-transformers)
# Index (pick one embedding mode):
code-indexer index --root . --embedding-provider local --embedding-model all-MiniLM-L6-v2
# or, with API embeddings:
# export OPENAI_API_KEY=...
# code-indexer index --root . --embedding-provider api --embedding-model text-embedding-3-small
code-indexer explore --root . # human-readable summary of graph + chroma paths
code-indexer status --root . # manifest JSON (file hashes, last commit if git)
code-indexer search "authentication middleware" --root .
code-indexer mcp --root . # stdio MCP server (sets CODE_INDEXER_ROOT for this process)
Artifacts are written to .code-indexer/ under --root (ignored by git in this project).
OPENAI_API_KEY: required for API embeddings (--embedding-provider api) and LLM summaries (--llm).
.code-indexer/ under the repo root (default):
- graph.sqlite: nodes and edges
- chroma/: persistent Chroma data
- manifest.json: file hashes for incremental indexing
- config.json: resolved config snapshot from the last index run (API keys redacted)
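As a quick sanity check after indexing, the artifact layout listed above can be verified from Python. This is a minimal sketch, not part of code-indexer: the helper name `check_artifacts` is hypothetical, and it assumes the default `.code-indexer/` directory under the repo root.

```python
from pathlib import Path

# Artifact names come from the list above; the helper itself is illustrative.
EXPECTED_ARTIFACTS = ("graph.sqlite", "chroma", "manifest.json", "config.json")


def check_artifacts(root: str, artifact_dir: str = ".code-indexer") -> dict[str, bool]:
    """Return a mapping of artifact name -> whether it exists under root."""
    base = Path(root) / artifact_dir
    return {name: (base / name).exists() for name in EXPECTED_ARTIFACTS}


if __name__ == "__main__":
    # Report which artifacts are present for the current directory.
    for name, present in check_artifacts(".").items():
        print(f"{name}: {'ok' if present else 'missing'}")
```

A missing entry usually just means the index command has not been run for that root yet.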
Point CODE_INDEXER_ROOT at the repo you indexed, then run:
export CODE_INDEXER_ROOT=/path/to/repo
code-indexer mcp
Tools include code_search, similar_tests, get_symbol, find_symbols, outline_file, outline_component, get_callers, get_callees, trace_flow, index_status, and index_refresh.
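Many MCP clients launch stdio servers from a JSON entry that names a command, its arguments, and its environment variables. The sketch below builds such an entry for the server described above. The `mcpServers` layout and the `code-indexer` server label are assumptions about the client, not something this project defines, so check your client's documentation for its exact schema.

```python
import json


def mcp_server_entry(repo_root: str) -> dict:
    """Build a client-side entry for the stdio server (layout is an assumption)."""
    return {
        "mcpServers": {
            "code-indexer": {  # hypothetical label chosen for this example
                "command": "code-indexer",
                "args": ["mcp"],
                # Mirrors the `export CODE_INDEXER_ROOT=...` step above.
                "env": {"CODE_INDEXER_ROOT": repo_root},
            }
        }
    }


if __name__ == "__main__":
    print(json.dumps(mcp_server_entry("/path/to/repo"), indent=2))
```

Passing the root through the environment keeps the command line identical across repositories; only the `env` value changes.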
pip install -e ".[dev,local-embed]"
pytest tests/ -q # includes integration test (indexes repo + MCP stdio)
pytest tests/ -q -m "not integration" # skip slow MCP integration
Everything lives under your artifact directory (default .code-indexer/ under the repo root).
- CLI summary (counts, file paths, and sample nodes/edges):
  code-indexer explore --root /path/to/repo
  code-indexer explore --root . --json # full JSON (includes Chroma `peek`)
  code-indexer explore --root . --export-graph graph.json
- SQLite graph: open graph.sqlite in DB Browser for SQLite or the sqlite3 shell. Tables: nodes, edges. Example:
  sqlite3 .code-indexer/graph.sqlite "SELECT kind, COUNT(*) FROM nodes GROUP BY kind;"
  sqlite3 .code-indexer/graph.sqlite "SELECT id, kind, path, name FROM nodes LIMIT 20;"
- Chroma vectors: the chroma/ folder is Chroma’s on-disk store; the collection name is repo_id__embedding_model_id (see explore output). Use code-indexer explore --json to see peek sample rows, or query from Python with chromadb.PersistentClient(path=".../chroma").
- Graph visualization: export JSON and convert it for Graphviz, D3, or similar tools:
  code-indexer explore --root . --export-graph /tmp/graph.json
  The file has nodes and edges with ids you can feed to graph layout tools.
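The SQLite inspection in the list above can also be scripted. Below is a minimal Python sketch that runs the same `GROUP BY kind` query with the standard-library `sqlite3` module. It assumes only what the example queries imply: a `nodes` table with a `kind` column. The function name `count_nodes_by_kind` is illustrative.

```python
import sqlite3


def count_nodes_by_kind(db_path: str) -> dict[str, int]:
    """Return {kind: count} from the nodes table of the graph database."""
    conn = sqlite3.connect(db_path)
    try:
        rows = conn.execute(
            "SELECT kind, COUNT(*) FROM nodes GROUP BY kind ORDER BY kind"
        ).fetchall()
    finally:
        # Always release the file handle, even if the query fails.
        conn.close()
    return dict(rows)


# Example call, assuming the default artifact path:
# count_nodes_by_kind(".code-indexer/graph.sqlite")
```

The result is a plain dictionary, so it can be diffed between index runs to spot unexpected changes in the graph.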
See TICKETS.md for remaining plan items (hybrid search, git diff optimization, flow summaries, etc.) and vectorized LLM summaries for semantic search over plain-language descriptions.