fix(mcp): dedupe one repo's two path spellings onto one DB connection (#1057) #1082
Open
inth3shadows wants to merge 1 commit into
…ath (colbymchenry#1057)

Two spellings of one repo, either a symlinked checkout or upper/lowercase variants of a path on a case-insensitive mount (Windows NTFS, WSL DrvFs /mnt), resolved to two different cache keys, so the MCP server opened a second SQLite connection to the same .codegraph/codegraph.db; concurrent writes then corrupted the index.

Key projectCache (and the default-instance reuse check) on (dev, ino) filesystem identity, which is identical for every spelling. realpath alone is insufficient: on a case-insensitive, case-preserving filesystem it returns the caller's casing and cannot dedupe case-variants. Mirrors the existing inode-identity pattern in DatabaseConnection.openedInode.

Adds __tests__/root-identity.test.ts (symlink case is a deterministic, FS-agnostic proxy for the case-insensitive-mount scenario).
This was referenced Jun 30, 2026
Problem (#1057)
On WSL, opening a repo under `/mnt` via two different path spellings (most concretely an upper- vs lowercase variant of the same path) corrupts `.codegraph`, even with `CODEGRAPH_NO_DAEMON` set. The same class of bug hits a symlinked checkout on any platform.

Root cause
`ToolHandler.getCodeGraph` caches each open `CodeGraph` (a live SQLite connection) in `projectCache`, and reuses the default instance, keyed on the resolved-root path string. Two spellings of one physical directory resolve to two different strings → two cache entries → two SQLite connections to the same `.codegraph/codegraph.db`. Concurrent writes across the two connections corrupt the index. (This is the same second-connection hazard already documented at the `#238` comment in that method.)

Why not just `realpathSync` the root
That was my first attempt, and it is insufficient, as verified on a real WSL DrvFs mount:
`realpathSync` resolves symlinks and `.`/`..`, but on a case-insensitive, case-preserving filesystem it returns the caller's casing, so it cannot dedupe case-variants. Filesystem identity `(dev, ino)` is identical for every spelling and is the robust key.

Fix
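For illustration, an identity-based key of the kind argued for above might look like the following minimal sketch. The body of `canonicalRootKey` is not shown in this description, so everything here is an assumption: the `id:`/`path:` prefixes, the `bigint` stat option, and the fallback to the resolved path string are hypothetical choices, not the PR's actual code. The `demo` function uses a symlink as the second spelling of one directory.

```typescript
import * as fs from "node:fs";
import * as os from "node:os";
import * as path from "node:path";

// Hypothetical sketch of an identity-based cache key; not the PR's implementation.
function canonicalRootKey(root: string): string {
  try {
    // statSync follows symlinks; bigint avoids precision loss on large inode numbers.
    const st = fs.statSync(root, { bigint: true });
    return `id:${st.dev}:${st.ino}`;
  } catch {
    // Assumed fallback when the directory cannot be stat'ed: the resolved path string.
    return `path:${path.resolve(root)}`;
  }
}

// Creates one directory plus a symlink to it and compares both kinds of key.
function demo(): { sameString: boolean; sameKey: boolean } {
  const base = fs.mkdtempSync(path.join(os.tmpdir(), "root-identity-"));
  try {
    const real = path.join(base, "repo");
    const alias = path.join(base, "repo-link");
    fs.mkdirSync(real);
    fs.symlinkSync(real, alias, "dir");
    return {
      // Path strings differ even though both name one physical directory.
      sameString: path.resolve(real) === path.resolve(alias),
      // The (dev, ino) key is the same for both spellings.
      sameKey: canonicalRootKey(real) === canonicalRootKey(alias),
    };
  } finally {
    fs.rmSync(base, { recursive: true, force: true });
  }
}
```

A cache keyed this way would look up `projectCache` by the returned string instead of by the path, so both spellings would hit the same entry.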
Key `projectCache` and the default-instance reuse check on `(dev, ino)` via a new `canonicalRootKey()` helper. This mirrors the inode-identity pattern this codebase already uses in `DatabaseConnection.openedInode` (`statInode`) for replace-on-disk detection, so it's idiomatic rather than novel. Minimal blast radius: `findNearestCodeGraphRoot` itself is unchanged.

Reproduced (WSL2 / Ubuntu)
Before fix, against the built resolver:
After fix, `canonicalRootKey` converges:

Tests
`__tests__/root-identity.test.ts` (4 tests). The symlink case is a deterministic, filesystem-agnostic proxy for the case-insensitive-mount scenario (both produce two path strings for one inode), so it runs on case-sensitive CI. Full suite green locally (the only failures were two pre-existing CPU-contention timing flakes, `query-pool` and `mcp-daemon`, that pass in isolation).

Scope / follow-up
This fixes the in-process connection cache, i.e. the reported `CODEGRAPH_NO_DAEMON` path. `daemon-registry.ts` also hashes the path string rather than `(dev, ino)`, so two spellings can produce two discovery entries under `~/.codegraph/daemons/` for the same daemon. That's a smaller, separate bug than this PR's: the actual daemon-start arbitration (`daemon.ts`'s `acquireLockViaExclusiveOpen`, an `O_CREAT|O_EXCL` create on a real file inside the project's own `.codegraph/`) is already safe against path-spelling variance, since the OS resolves any spelling to the same physical file. So the with-daemon path doesn't get the same connection-level corruption this PR fixes; worst case is a cosmetic duplicate/stale entry in `codegraph list` / `stop --all`. Not expanding this PR for it.

Validated on WSL2 (Ubuntu). Happy to adjust naming/placement to your preference.
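To illustrate why an exclusive create is insensitive to path spelling, here is a minimal sketch. It is not the project's `acquireLockViaExclusiveOpen`: the `tryAcquireLock` helper and the `daemon.lock` file name are hypothetical, and a symlink stands in for the second spelling.

```typescript
import * as fs from "node:fs";
import * as os from "node:os";
import * as path from "node:path";

// Hypothetical stand-in for a lock arbitration step; returns true if this call created the file.
function tryAcquireLock(lockPath: string): boolean {
  try {
    // The "wx" flag includes O_CREAT | O_EXCL: the open fails with EEXIST if the file exists.
    const fd = fs.openSync(lockPath, "wx");
    fs.closeSync(fd);
    return true;
  } catch (err) {
    if ((err as NodeJS.ErrnoException).code === "EEXIST") return false;
    throw err;
  }
}

// Attempts the lock through two spellings of one directory.
function lockDemo(): { first: boolean; second: boolean } {
  const base = fs.mkdtempSync(path.join(os.tmpdir(), "lock-spelling-"));
  try {
    const real = path.join(base, "repo");
    const alias = path.join(base, "repo-link");
    fs.mkdirSync(path.join(real, ".codegraph"), { recursive: true });
    fs.symlinkSync(real, alias, "dir");
    // Both paths name the same physical file, so only the first create can succeed.
    const first = tryAcquireLock(path.join(real, ".codegraph", "daemon.lock"));
    const second = tryAcquireLock(path.join(alias, ".codegraph", "daemon.lock"));
    return { first, second };
  } finally {
    fs.rmSync(base, { recursive: true, force: true });
  }
}
```

The arbitration happens on the file itself, which the OS resolves from either spelling, whereas a registry keyed on a hash of the path string would see two different inputs.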
Edit: Softened the "Scope / follow-up" section above. My original wording called this "the same class of issue" as the corruption bug this PR fixes. On a closer look at `daemon.ts`'s lock arbitration, it isn't: the daemon-start path is already safe by construction, and the only actual gap is the cosmetic duplicate-listing one described above. Didn't want the overstatement to stand uncorrected.