Skip to content

fix(mcp): tolerate concurrent registry migrations#4234

Open
YOMXXX wants to merge 3 commits into
tinyhumansai:mainfrom
YOMXXX:fix/GH-4194-mcp-migration-race
Open

fix(mcp): tolerate concurrent registry migrations#4234
YOMXXX wants to merge 3 commits into
tinyhumansai:mainfrom
YOMXXX:fix/GH-4194-mcp-migration-race

Conversation

@YOMXXX

@YOMXXX YOMXXX commented Jun 28, 2026

Copy link
Copy Markdown
Contributor

Summary

  • Makes MCP registry additive SQLite migrations tolerant of duplicate-column races.
  • Re-checks mcp_servers columns after ALTER TABLE ADD COLUMN failures and treats the migration as successful when the desired column is now present.
  • Adds grep-friendly debug/warn logging around additive schema migration branches.
  • Adds a regression test for stale PRAGMA table_info snapshots hitting an already-added deployment_url column.
  • Fixes stale links in translated README docs that the PR markdown-link CI scans.

Problem

  • The MCP Servers page can surface Failed to add deployment_url column to mcp_servers on load.
  • The additive migration was serially idempotent, but a stale column snapshot can still attempt ALTER TABLE ADD COLUMN deployment_url after another connection has already added it.
  • SQLite does not support ALTER TABLE ADD COLUMN IF NOT EXISTS, so the duplicate-column path needs explicit handling.
  • PR CI also checks translated docs and was blocked by pre-existing stale external links in five localized README files.

Solution

  • Routes each post-schema mcp_servers column addition through add_mcp_servers_column_if_missing.
  • Keeps the existing fast path for columns already present in the initial PRAGMA table_info snapshot.
  • On ALTER TABLE failure, refreshes the column list; if the target column exists, logs the observed concurrent/stale-snapshot path and continues.
  • Preserves the original contextual error when the target column still does not exist or column refresh fails.
  • Updates localized README links for the current native voice docs path, removes the Reddit link that lychee rejects with 403, and points the Star History wrapper link at the SVG URL already used by the image.

Submission Checklist

If a section does not apply to this change, mark the item as N/A with a one-line reason. Do not delete items.

  • Tests added or updated (happy path + at least one failure / edge case) per Testing Strategy
  • Diff coverage ≥ 80% — targeted Rust tests cover the changed migration paths; CI remains the authoritative diff-cover gate.
  • Coverage matrix updated — N/A: persistence bug fix and docs link cleanup, no feature row added/removed/renamed.
  • All affected feature IDs from the matrix are listed in the PR description under ## Related — N/A: no coverage-matrix feature ID applies to this persistence migration fix.
  • No new external network dependencies introduced (mock backend used per Testing Strategy)
  • Manual smoke checklist updated if this touches release-cut surfaces (docs/RELEASE-MANUAL-SMOKE.md) — N/A: no release manual smoke surface changed.
  • Linked issue closed via Closes #NNN in the ## Related section

Impact

  • Runtime/platform impact: desktop/core persistence only.
  • Migration compatibility: legacy MCP registry DBs still get transport, deployment_url, and enabled added as before.
  • Failure behavior: duplicate-column/stale-snapshot migration races no longer bubble to UI when the desired column exists after refresh.
  • Docs impact: five localized README files now avoid stale/CI-blocking external links.
  • Security/network: no secret handling or external network behavior changed.

Related


AI Authored PR Metadata (required for Codex/Linear PRs)

Keep this section for AI-authored PRs. For human-only PRs, mark each field N/A.

Linear Issue

  • Key: N/A
  • URL: N/A

Commit & Branch

  • Branch: fix/GH-4194-mcp-migration-race
  • Commit SHA: 8fd5768fc, 58440321d, 82c235419

Validation Run

  • pnpm --filter openhuman-app format:check — N/A: frontend workspace not changed.
  • pnpm typecheck — N/A: TypeScript/frontend not changed.
  • Focused tests: GGML_NATIVE=OFF cargo test --manifest-path Cargo.toml --lib additive_column_migration_tolerates_duplicate_column_from_stale_snapshot
  • Focused tests: GGML_NATIVE=OFF cargo test --manifest-path Cargo.toml --lib mcp_registry::store::tests
  • Rust check: GGML_NATIVE=OFF cargo check --manifest-path Cargo.toml
  • Rust fmt/check (if changed): cargo fmt --manifest-path Cargo.toml --check
  • Rust/docs whitespace check: git diff --check
  • Docs link precheck: rg -n "tinyhumans\\.gitbook\\.io/openhuman/features/voice|reddit\\.com/r/tinyhumansai|star-history\\.com/#tinyhumansai/openhuman" docs/README.*.md returned no matches.
  • Tauri fmt/check (if changed): N/A: Tauri crate not changed.

Validation Blocked

  • command: N/A
  • error: N/A
  • impact: N/A

Behavior Changes

  • Intended behavior change: MCP registry schema init now tolerates stale column snapshots when an additive column has already been created by another migration attempt.
  • User-visible effect: MCP Servers page should no longer show the duplicate deployment_url migration failure when the column already exists.

Parity Contract

  • Legacy behavior preserved: existing serial idempotency and legacy-row defaults for stdio transport/enabled state remain covered by existing tests.
  • Guard/fallback/dispatch parity checks: duplicate-column tolerance only succeeds after refreshing schema and confirming the target column exists.

Duplicate / Superseded PR Handling

  • Duplicate PR(s): None found for this branch.
  • Canonical PR: This PR.
  • Resolution (closed/superseded/updated): N/A

Summary by CodeRabbit

  • Bug Fixes
    • Improved database schema upgrades for mcp_servers by making column additions safely idempotent during concurrent/duplicate migration attempts.
    • Reduced startup/upgrade failures when schema snapshots are stale and columns already exist.
  • Tests
    • Added a unit test to simulate stale schema snapshots and verify deployment_url is added correctly.
  • Documentation
    • Updated localized README social and feature links: “Reddit” is now plain text, Star History embeds now use the generated SVG link, and the voice documentation URL was adjusted.

@YOMXXX YOMXXX requested a review from a team June 28, 2026 10:13
@coderabbitai

coderabbitai Bot commented Jun 28, 2026

Copy link
Copy Markdown
Contributor

Review Change Stack

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 455dd4a0-5d2c-46eb-aa76-fb8a179a3181

📥 Commits

Reviewing files that changed from the base of the PR and between 5844032 and 82c2354.

📒 Files selected for processing (1)
  • src/openhuman/mcp_registry/store.rs
🚧 Files skipped from review as they are similar to previous changes (1)
  • src/openhuman/mcp_registry/store.rs

📝 Walkthrough

Walkthrough

store.rs now uses a helper to add mcp_servers columns with stale-snapshot retry handling and includes a test for that case. The localized README files also update the Reddit, voice documentation, and Star History links.

Changes

Idempotent mcp_servers column migration

Layer / File(s) Summary
Schema migration helper and retry handling
src/openhuman/mcp_registry/store.rs
add_mcp_servers_column_if_missing replaces inline ALTER TABLE ... ADD COLUMN checks, skips columns already present in the snapshot, and treats a live-schema match after an ALTER failure as success.
Stale snapshot test
src/openhuman/mcp_registry/store.rs
A unit test calls the additive column helper with an outdated column snapshot and verifies the column exists afterward without surfacing an error.

README link target updates

Layer / File(s) Summary
Reddit social link text changes
docs/README.de.md, docs/README.ja-JP.md, docs/README.ko.md, docs/README.ur-pk.md, docs/README.zh-CN.md
The Reddit entry in the top social-links row is changed from a hyperlink to plain text across the localized READMEs.
Voice documentation link target
docs/README.de.md, docs/README.ja-JP.md, docs/README.ko.md, docs/README.ur-pk.md, docs/README.zh-CN.md
The native voice feature link target is updated in the feature-list bullet across the localized READMEs.
Star History chart URL
docs/README.de.md, docs/README.ja-JP.md, docs/README.ko.md, docs/README.ur-pk.md, docs/README.zh-CN.md
The Star History chart link is changed to an api.star-history.com/svg URL across the localized READMEs.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related PRs

  • tinyhumansai/openhuman#3387: Both PRs touch the documentation, including docs/README.ur-pk.md, and change localized README content/link wiring.

Poem

🐇 Hop, hop — the schema found its way,
Old snapshots no longer spoil the day.
The links all twinkle in every tongue,
And Star History sings where the chart is hung.
No red banner, just a quiet bloom,
A tidy hare-hop through code and room.

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Out of Scope Changes check ⚠️ Warning The localized README link updates are unrelated to the MCP migration bug and add non-functional scope to the PR. Move the README link cleanup into a separate documentation PR so this change stays focused on the MCP migration fix.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title clearly reflects the main change: making MCP registry migrations tolerate concurrent duplicate-column races.
Linked Issues check ✅ Passed The SQLite migration helper makes the deployment_url path idempotent and avoids duplicate-column errors, matching the bug fix requirements.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (1)
src/openhuman/mcp_registry/store.rs (1)

136-150: 📐 Maintainability & Code Quality | 🔵 Trivial | ⚡ Quick win

Add logging to the genuine-failure branch.

The race/idempotent paths log via trace!/debug!/warn!, but the real-error branch at Line 143 (Ok(_) => Err(err).context(context) — ALTER failed and the column is confirmed still absent) propagates the error silently. This is the path that surfaces an actual migration failure to the UI, so it's the most useful one to log. As per coding guidelines: "Add debug logging to entry/exit, branches, external calls, retries/timeouts, state transitions, and errors using log/tracing".

♻️ Proposed logging on the genuine-error branch
-            Ok(_) => Err(err).context(context),
+            Ok(_) => {
+                tracing::debug!(
+                    "[mcp-registry] schema add column failed and still absent column={column} error={err}"
+                );
+                Err(err).context(context)
+            }
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/openhuman/mcp_registry/store.rs` around lines 136 - 150, The
genuine-failure path in the mcp registry schema-add logic is missing an error
log, so add a tracing log in the Ok(_) => Err(err).context(context) branch
inside the mcp_servers_columns match in store.rs. Use the existing
column/context variables and include the original err details before propagating
so the failed ALTER COLUMN case is visible in logs, while keeping the existing
debug/warn behavior for the idempotent and refresh-failure branches.

Source: Coding guidelines

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Nitpick comments:
In `@src/openhuman/mcp_registry/store.rs`:
- Around line 136-150: The genuine-failure path in the mcp registry schema-add
logic is missing an error log, so add a tracing log in the Ok(_) =>
Err(err).context(context) branch inside the mcp_servers_columns match in
store.rs. Use the existing column/context variables and include the original err
details before propagating so the failed ALTER COLUMN case is visible in logs,
while keeping the existing debug/warn behavior for the idempotent and
refresh-failure branches.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: d3ff0d69-bb92-48a6-875c-9cf0a0f4acf7

📥 Commits

Reviewing files that changed from the base of the PR and between 5a41a4f and 8fd5768.

📒 Files selected for processing (1)
  • src/openhuman/mcp_registry/store.rs

coderabbitai[bot]
coderabbitai Bot previously approved these changes Jun 28, 2026
@YOMXXX

YOMXXX commented Jun 28, 2026

Copy link
Copy Markdown
Contributor Author

CodeRabbit flagged the localized README link cleanup as potentially out of scope. I kept it in this PR because the required Markdown Link Check scans these docs on this PR and was failing on pre-existing stale links, which left the MCP migration fix red. The docs changes are limited to the exact five files and link categories reported by lychee: the old native voice docs URL, Reddit's 403 response, and the Star History fragment URL.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

bug: MCP Servers page shows "Failed to add deployment_url column to mcp_servers" on every load

1 participant