Skip to content

feat(mcp): accept training-data-friendly parameter aliases#766

Merged
phernandez merged 3 commits intomainfrom
feat/mcp-param-aliases
Apr 29, 2026
Merged

feat(mcp): accept training-data-friendly parameter aliases#766
phernandez merged 3 commits intomainfrom
feat/mcp-param-aliases

Conversation

@phernandez
Copy link
Copy Markdown
Member

Summary

  • Models trained on common API conventions reach for offset/limit, find/old_text, folder/dir, etc. on first use of our MCP tools. Pydantic validation errors fire, the model loses trust in the tool, and falls back to raw filesystem reads — bypassing the knowledge graph entirely. (See Add Pydantic AliasChoices to MCP tool parameters #690 for the full diagnosis.)
  • Adds Pydantic v2 AliasChoices to parameters across 12 MCP tools. Canonical names remain the only thing advertised in the JSON schema, so models still learn the right name — but common alternatives are silently accepted at validation time as a safety net.
  • Zero breaking changes, zero runtime cost.

Tools updated

read_note, edit_note, search_notes, recent_activity, list_directory, write_note, move_note, delete_note, read_content, view_note, build_context, canvas

Aliases (selected highlights)

Canonical Aliases accepted Why
page / page_size offset, limit, page_number, per_page Universal pagination conventions
query q, search, text q is the HTTP convention
find_text find, old_text, old_content, search Highest-frequency miss per the issue
content (in edit_note) new_content, replacement, replace_with content is ambiguous as "replacement text"
section section_heading, heading Models prefer the more descriptive name
directory / dir_name folder, dir, path All three are interchangeable in training data
destination_path to, dest_path, new_path, destination Move/rename idioms
timeframe since, time_range, lookback Time-window naming varies wildly
is_directory is_dir Common abbreviation
url (build_context) uri, memory_url URL/URI confusion is endemic

The full alias map is in the commit message.

Design note

The advertised JSON schema continues to show only the canonical names. Aliases live entirely on the validation layer — silent safety net, not advertised API. This preserves the ability to deprecate aliases later without a breaking schema change. A schema-sanity test (test_aliases_not_advertised_in_schema) enforces this invariant.

Test plan

  • 19 new integration tests in test-int/mcp/test_param_aliases_integration.py covering each tool's aliases
  • Schema-sanity test asserts canonical names appear and aliases stay hidden across all 12 tools
  • Full MCP integration suite passes: 195 passed, 1 skipped, no regressions
  • ruff check and ty check clean

Closes #690

🤖 Generated with Claude Code

Models reach for common parameter names from training data (offset/limit
instead of page/page_size, find/old_text instead of find_text, etc.) on
first use of MCP tools. Pydantic validation errors fire, the model loses
trust in the tool, and falls back to filesystem reads — bypassing the
knowledge graph entirely.

Use Pydantic v2 AliasChoices on tool parameters so the canonical names
are still the only thing advertised in the JSON schema, but common
alternatives are silently accepted at validation time.

Aliases added across 12 tools:
  - pagination: page←offset/page_number, page_size←limit/per_page
  - search:     query←q/search/text, after_date←since/after/from_date,
                min_similarity←threshold/similarity_threshold,
                note_types←note_type/types
  - time:       timeframe←since/time_range/lookback
  - paths:      directory←folder/dir/path, dir_name←directory/folder/...,
                path←file_path/filepath/file
  - move:       destination_path←to/dest_path/new_path/destination
  - edit:       find_text←find/old_text/old_content/search,
                content←new_content/replacement/replace_with,
                section←section_heading/heading
  - flags:      overwrite←force/replace, is_directory←is_dir
  - context:    url←uri/memory_url, max_related←max_results

Closes #690

Signed-off-by: phernandez <paul@basicmachines.co>
@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented Apr 28, 2026

Claude finished @phernandez's task in 2m 19s —— View job


PR Review: feat(mcp): accept training-data-friendly parameter aliases

  • Read and analyze changed files
  • Review code quality & standards
  • Review testing & documentation
  • Review architecture compliance
  • Review security & performance
  • Review compatibility

Summary

This is a well-motivated, cleanly executed PR. The use of AliasChoices with validation_alias is idiomatic Pydantic v2 for exactly this use-case: accept common names at validation time without changing what the JSON schema advertises to models. The schema invariant test is an essential guard.


Code Quality & Standards ✅

Check Status Notes
Follows CLAUDE.md conventions Comments follow "why, not what" literate style
Python 3.12+ annotations Correct use of Annotated, AliasChoices, Field
SQLAlchemy 2.0 ✅ N/A No ORM changes
Line length ≤100 Spot-checked; all within limits
No commented-out code

The inline comments in edit_note.py are a good example of the literate style the project asks for:

# find_text is the highest-frequency miss per the issue: models reach for
# find/old_text/old_content/search before find_text every time.

Testing & Documentation ✅ (with minor gaps)

Check Status Notes
Integration tests for new behaviour 19 tests in test_param_aliases_integration.py
Schema invariant test test_aliases_not_advertised_in_schema is a critical guard
Canonical names still work test_*_canonical_names_still_work tests
100% coverage maintained ⚠️ See below

Three coverage gaps:

  1. delete_note has no functional integration test. The schema test verifies is_dir is hidden, but there's no test that actually sends is_dir=True to a delete_note call and confirms it routes correctly. Fix this →

  2. view_note has no functional alias test. Only the schema check covers it. Since view_note delegates directly to read_note(page=page, page_size=page_size), the aliases do work — but a quick test sending offset=1 to view_note would complete coverage. Fix this →

  3. build_context test is shallow — it asserts "no error" but doesn't verify the alias was actually resolved to the right URL. Given that the MemoryUrl resolver is exercised by other tests, this is a minor concern.


Architecture Compliance ✅

Check Status
MCP tools remain atomic & composable
No DB migrations needed
Local-first architecture intact
Knowledge graph consistency maintained

Security & Performance ✅

Check Status Notes
No hardcoded secrets
Input validation maintained Aliases don't bypass existing path traversal checks
Zero runtime cost Validation-time only; no additional I/O

The aliases are accepted only at Pydantic validation time and are then assigned to the canonical Python parameter, so all downstream validation (path traversal, type coercion, etc.) is completely unaffected.


Compatibility ✅

Check Status
Windows path comparisons ✅ N/A — no path comparison changes
No new emojis/unicode ✅ — pre-existing emoji usage in move_note.py, recent_activity.py, list_directory.py was not introduced by this PR

Minor Observations (not blockers)

  1. build_context has max_related with alias "limit_related" (build_context.py:162). This alias is unlikely to be used organically — "limit_related" is not a common convention. "max_results" (already in the list) is more natural. The extra alias is harmless but adds surface area.

  2. search_notes has "search" aliased to query, while edit_note has "search" aliased to find_text (search.py:308, edit_note.py:200). These are different tools so there's no conflict, but it's worth noting in case a model sends search= ambiguously to edit_note expecting it to match a note.

  3. recent_activity type parameter now has "types" and "kind" aliases (recent_activity.py:45). The parameter name type shadows Python's built-in — this is pre-existing and out of scope here, but worth flagging for a future cleanup.


Checklist

  • Follows Basic Memory's coding conventions in CLAUDE.md
  • Python 3.12+ type annotations and async patterns
  • SQLAlchemy 2.0 best practices (N/A)
  • FastAPI and Typer conventions followed (N/A)
  • 100-character line length limit maintained
  • No commented-out code blocks
  • Integration tests for new MCP tool aliases
  • Schema invariant test enforces aliases stay hidden
  • Test coverage for edge cases — minor gaps for delete_note and view_note aliases
  • [~] 100% test coverage — functional alias paths for delete_note and view_note are untested (schema-only)
  • Documentation updated (PR description is thorough)
  • CLAUDE.md unchanged (no convention changes)
  • MCP tools follow atomic, composable design
  • No database schema changes
  • Preserves local-first architecture principles
  • Knowledge graph operations maintain consistency
  • Markdown file handling unchanged
  • No hardcoded secrets or credentials
  • Input validation for MCP tools maintained
  • Zero performance impact
  • No sensitive data in logs or commits
  • Windows path compatibility unaffected
  • No new emojis/unicode in console output

Overall: ✅ Approve with minor suggestions. The two missing functional alias tests for delete_note and view_note are the only items I'd recommend addressing before merge.

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 87285adf12

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread src/basic_memory/mcp/tools/search.py Outdated
page_size: int = 10,
page: Annotated[
int,
Field(default=1, validation_alias=AliasChoices("page", "offset", "page_number")),
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Convert offset aliases to page numbers

Mapping offset directly onto page changes query semantics and returns the wrong slice for common inputs. In this codebase, downstream search pagination is page-based (offset = (page - 1) * page_size in the API layer), so a call like offset=20, limit=10 now fetches page 20 instead of items 21–30, and offset=0 can produce invalid page handling in other tools that validate page >= 1. If offset is accepted as an alias, it needs translation to a 1-indexed page value rather than direct binding.

Useful? React with 👍 / 👎.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch — fixed in 07798a5. Dropped offset as an alias for page across all five paginated tools (read_note, search_notes, recent_activity, view_note, build_context) since the semantics differ (item-index vs. 1-indexed page-number). Kept the safe synonyms (page_number, limit, per_page). Added test_offset_is_not_aliased_to_page as a regression guard so this doesn't sneak back in.

Codex review caught a real bug: aliasing `offset` to `page` silently returns
the wrong slice. `offset` is item-indexed (skip N items); `page` is 1-indexed
page-number. A call like `offset=20, limit=10` would expect items 21-30 but
our binding turned it into page 20 → items 191-200.

The original issue (#690) listed `offset` as an alias for `page`, but the
semantics are different enough that silent acceptance is worse than rejection.
Dropped the alias across all five paginated tools (read_note, search_notes,
recent_activity, view_note, build_context) and added a regression-guard test
so a future contributor doesn't reintroduce it.

Other review feedback addressed:
  - dropped `limit_related` alias for `max_related` (not a natural convention,
    `max_results` already covers the use case)
  - added functional integration test for view_note pagination aliases
  - added functional integration test for delete_note `is_dir` alias

Kept aliases that are true synonyms: `page_number` (≡ page),
`limit`/`per_page` (≡ page_size, both mean "max items per response").

Signed-off-by: phernandez <paul@basicmachines.co>
@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented Apr 28, 2026

Claude finished @phernandez's task in 1m 49s —— View job


PR Review: feat(mcp): accept training-data-friendly parameter aliases

  • Read and analyze changed files
  • Review code quality & standards
  • Review testing & documentation
  • Review architecture compliance
  • Review security & performance
  • Review compatibility

Summary

This sync addresses all three concerns from the previous round:

  1. offset alias removed from page across all five paginated tools — the semantic mismatch (item-index vs. 1-indexed page-number) is now documented inline and locked behind a regression test.
  2. delete_note functional alias test added (test_delete_note_accepts_is_dir_alias).
  3. view_note functional alias test added (test_view_note_accepts_pagination_aliases).

Code Quality & Standards ✅

Check Status Notes
Python 3.12+ annotations Annotated, AliasChoices, Field — idiomatic throughout
Line length ≤100
Literate-style comments Every alias block explains the why
offset omission documented All five paginated tools carry the same inline rationale
No commented-out code

The inline comment is consistent and precise across all five paginated tools (read_note.py:78–80, search_notes:312–314, recent_activity.py:54–55, view_note.py:22–23, build_context.py:151–152):

# `offset` is intentionally NOT aliased: it has different semantics
# (item-indexed vs. 1-indexed page-number).

Testing & Documentation ✅

Check Status Notes
19 functional alias tests Each tool exercised with actual roundtrip
test_offset_is_not_aliased_to_page Regression guard locks out the bad alias
delete_note functional alias test is_dir=False now tested
view_note functional alias test page_number=1, limit=10 aliases tested
Schema sanity: test_aliases_not_advertised_in_schema Covers all 12 tools
Canonical names still work Separate canonical-names tests present

One minor observation (not a blocker): build_context.py:161–163 aliases max_related to "max_results" only — there is no "limit_related" alias in the code. The schema test's must_not_have list includes "limit_related" anyway as a preemptive guard, which trivially passes. This is harmless and arguably useful as a future-contributor guard, but it's worth noting the asymmetry between code and test.


Architecture Compliance ✅

Check Status
MCP tools remain atomic & composable
Validation-layer only — no downstream changes
Knowledge graph / file handling unchanged
No DB migrations

Security & Performance ✅

Check Status Notes
Input validation maintained Aliases resolve to canonical Python params before any downstream validation
Path traversal checks unaffected validate_project_path runs on the resolved value
Zero runtime cost Pydantic validation-time only
No sensitive data exposure

Compatibility ✅

Check Status
Windows path comparisons unaffected
No new emojis/unicode in console/log output

Checklist

  • Follows Basic Memory's coding conventions in CLAUDE.md
  • Python 3.12+ type annotations and async patterns
  • SQLAlchemy 2.0 best practices (N/A)
  • FastAPI and Typer conventions followed (N/A)
  • 100-character line length limit maintained
  • No commented-out code blocks
  • Integration tests for new MCP tool aliases
  • Schema invariant test enforces aliases stay hidden
  • Test coverage for edge cases — delete_note and view_note functional alias tests added
  • 100% test coverage — all gaps from previous review addressed
  • offset semantic mismatch fixed and regression-guarded
  • Documentation updated (PR description is thorough)
  • MCP tools follow atomic, composable design
  • No database schema changes
  • Preserves local-first architecture principles
  • Knowledge graph operations maintain consistency
  • Markdown file handling unchanged
  • No hardcoded secrets or credentials
  • Input validation for MCP tools maintained
  • Zero performance impact
  • Windows path compatibility unaffected

Overall: ✅ Approve. All issues from the previous review have been addressed. The offset-is-not-page fix is correct and well-tested; the two missing functional alias tests are in place. Ready to merge.

@phernandez
Copy link
Copy Markdown
Member Author

Addressed both reviews in 07798a5:

Codex (P2 — bug): Dropped offset as alias for page across read_note, search_notes, recent_activity, view_note, build_context. The semantic mismatch was real — offset=20, limit=10 would have returned page 20 instead of items 21-30. Kept the safe synonyms (page_number, limit, per_page). Added test_offset_is_not_aliased_to_page to lock in the deliberate omission.

Claude review minors:

  • Dropped limit_related alias for max_related (not a natural convention; max_results already covers it)
  • Added functional integration test for view_note pagination aliases
  • Added functional integration test for delete_note is_dir alias
  • Acknowledged-not-acted: search aliased to different params across search_notes/edit_note (intentional — different tools, different scopes); type shadowing builtin in recent_activity (pre-existing, out of scope)

Test counts: 22 passed in the alias suite (was 19), full MCP integration suite still green.

…liases

Codex/Claude review noted an asymmetry: the schema-sanity test's
must_not_have lists include `offset` (for page) and `limit_related` (for
max_related) even though those aliases were never accepted in the code.
That's intentional — listing rejected aliases here acts as a regression
guard so a future contributor can't reintroduce them without the test
catching it. Made that intent explicit in the docstring so the entries
don't read as dead code.

No behavior change; doc-only.

Signed-off-by: phernandez <paul@basicmachines.co>
@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented Apr 29, 2026

Claude finished @phernandez's task in 2m 14s —— View job


PR Review: feat(mcp): accept training-data-friendly parameter aliases

  • Read and analyze changed files
  • Review code quality & standards
  • Review testing & documentation
  • Review architecture compliance
  • Review security & performance
  • Review compatibility

Summary

This is the third pass. Both blocking issues from previous rounds are resolved: the offsetpage semantic mismatch is fixed and regression-guarded, and the delete_note / view_note functional alias tests are now in place. The implementation is clean and complete.


Code Quality & Standards ✅

Check Status Notes
Follows CLAUDE.md conventions Literate-style "why" comments on every alias block
Python 3.12+ annotations Annotated, AliasChoices, Field — idiomatic throughout
SQLAlchemy 2.0 ✅ N/A No ORM changes
Line length ≤100
No commented-out code

The offset-omission comment is now consistent across all five paginated tools (read_note.py:78–80, search.py:312–314, recent_activity.py:54–55, view_note.py:22–23, build_context.py:151–152):

# `offset` is intentionally NOT aliased: it has different semantics
# (item-indexed vs. 1-indexed page-number).

Testing & Documentation ✅

Check Status Notes
22 functional alias tests Covers all 12 tools end-to-end
test_offset_is_not_aliased_to_page Regression guard locks the deliberate omission
delete_note is_dir alias test_delete_note_accepts_is_dir_alias added
view_note pagination aliases test_view_note_accepts_pagination_aliases (page_number=1, limit=10) added
Schema sanity: test_aliases_not_advertised_in_schema Covers all 12 tools; must_not_have lists include both accepted and deliberately-rejected aliases
Canonical names still work Separate canonical-names tests present
100% coverage All gaps from previous review addressed

One trivial observation (not a blocker): build_context test (test_build_context_accepts_url_aliases, line 514) asserts result.content or result.structured_content — it verifies the alias is accepted, but not that it resolved to the right URL. Given MemoryUrl validation is exercised by other tests this is an acceptable trade-off; a deeper assertion would require knowing the exact response shape at test time.


Architecture Compliance ✅

Check Status
MCP tools remain atomic & composable
Validation-layer only — no downstream changes
Knowledge graph / file handling unchanged
No DB migrations

move_note canonical destination_folder (move_note.py:362) is a genuine pre-existing canonical parameter (with dest_folder/to_folder aliases), correctly reflected in the schema test's must_have list. Not introduced by this PR; works correctly.


Security & Performance ✅

Check Status Notes
Input validation maintained Aliases resolve to canonical Python params before any downstream path/type validation
Path traversal checks unaffected validate_project_path runs on the resolved value
Zero runtime cost Pydantic validation-time only
No sensitive data exposure

Compatibility ✅

Check Status
Windows path comparisons unaffected
No new emojis/unicode in console/log output

Checklist

  • Follows Basic Memory's coding conventions in CLAUDE.md
  • Python 3.12+ type annotations and async patterns
  • SQLAlchemy 2.0 best practices (N/A)
  • FastAPI and Typer conventions followed (N/A)
  • 100-character line length limit maintained
  • No commented-out code blocks
  • Integration tests for new MCP tool aliases
  • Schema invariant test enforces aliases stay hidden
  • Test coverage for edge cases — all gaps addressed
  • 100% test coverage — all gaps from previous reviews resolved
  • offset semantic mismatch fixed and regression-guarded
  • Documentation updated (PR description is thorough)
  • MCP tools follow atomic, composable design
  • No database schema changes
  • Preserves local-first architecture principles
  • Knowledge graph operations maintain consistency
  • Markdown file handling unchanged
  • No hardcoded secrets or credentials
  • Input validation for MCP tools maintained
  • Zero performance impact
  • Windows path compatibility unaffected

Overall: ✅ Approve. All issues from prior reviews are resolved. The PR is well-motivated, cleanly implemented, and thoroughly tested. Ready to merge.

@phernandez phernandez merged commit ee1558e into main Apr 29, 2026
40 of 41 checks passed
@phernandez phernandez deleted the feat/mcp-param-aliases branch April 29, 2026 01:10
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Add Pydantic AliasChoices to MCP tool parameters

1 participant