Skip to content

[Code-Review Graph (CRG) — follow-ups (after 2026-05-04 stack evaluation)] **CRG hybrid-search provider mismatch bug.** `code_review_gr... #230

Description

@jon-devlapaz

CRG hybrid-search provider mismatch bug. code_review_graph/search.py:184 instantiates EmbeddingStore(store.db_path, model=model) for query-time encoding, ignoring the provider column already stored in the embeddings table per row. Result: when nodes were embedded with cloud (Google text-embedding-004, 768-dim), query encoding still tries LocalEmbeddingProvider (sentence-transformers, 384-dim) and fails with "sentence-transformers not installed". Bench on 2026-05-04 confirmed: 1340 Google embeddings stored in .code-review-graph/graph.db, but every hybrid query logs Embedding search failed: sentence-transformers not installed and falls back to FTS5-only. Two fixes possible: (a) patch _embedding_search in search.py to read the stored provider from the embeddings table before instantiating the query-side EmbeddingStore (~10 lines, PR upstream to https://github.com/tirth8205/code-review-graph), or (b) install pip install code-review-graph[embeddings] and re-embed with local all-MiniLM-L6-v2 (sentence-transformers + pytorch ~700 MB into pyenv 3.13.12). Until fixed, vector search is dormant and semantic_search_nodes_tool is FTS5-only on this repo — which is empirically still good for symbol queries (see AGENTS.md "Practical query routing"), so this is not blocking, just suboptimal.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions