Add query_documents for advanced document querying #71
🦋 Changeset detected
Latest commit: 463351a
The changes in this PR will be included in the next version bump. This PR includes changesets to release 1 package.
Pull request overview
This PR adds a new MCP tool (query_documents) to expose Paperless-NGX’s richer /api/documents/ query capabilities (full-text query, custom field query, and an allowlisted set of documented filters), while keeping list_documents focused on simple listing and retaining search_documents as a compatibility wrapper.
Changes:
- Added a shared query builder/validator (`buildDocumentQueryString`, `custom_field_query` validation, and allowlisted `paperless_filters`).
- Introduced `query_documents` and refactored `list_documents`/`search_documents` to share the same execution path.
- Added test coverage for query serialization, validation, and OpenAPI allowlist sync; updated README usage examples and `.gitignore`.
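To make the allowlisting idea concrete, here is a minimal sketch of how a query-string builder of this kind could look. It is not the PR's `buildDocumentQueryString`: the function name `buildQueryStringSketch`, the `QueryArgs` shape, and the three entries in `ALLOWED_FILTERS` are assumptions chosen for illustration only.

```typescript
// Hypothetical argument shape; the real tool schema in the PR may differ.
type QueryArgs = {
  query?: string;
  page?: number;
  page_size?: number;
  custom_field_query?: unknown;
  paperless_filters?: Record<string, string | number | boolean>;
};

// Illustrative allowlist only. The PR derives and checks its allowlist
// against the OpenAPI spec; these three names are assumptions.
const ALLOWED_FILTERS = new Set<string>([
  "title__icontains",
  "correspondent__id",
  "tags__id__all",
]);

function buildQueryStringSketch(args: QueryArgs): string {
  const params = new URLSearchParams();

  // First-class parameters are copied through under fixed names.
  if (args.query !== undefined) params.set("query", args.query);
  if (args.page !== undefined) params.set("page", String(args.page));
  if (args.page_size !== undefined) {
    params.set("page_size", String(args.page_size));
  }

  // The custom field query is sent as a JSON-encoded string.
  if (args.custom_field_query !== undefined) {
    params.set("custom_field_query", JSON.stringify(args.custom_field_query));
  }

  // Any other filter must be on the allowlist, otherwise we fail loudly
  // instead of forwarding an unknown parameter to the API.
  for (const [key, value] of Object.entries(args.paperless_filters ?? {})) {
    if (!ALLOWED_FILTERS.has(key)) {
      throw new Error(`Unsupported Paperless filter: ${key}`);
    }
    params.set(key, String(value));
  }

  return params.toString();
}
```

Rejecting unknown keys rather than dropping them silently means a caller finds out immediately when a filter name is misspelled, which is one plausible reason to pair the allowlist with a spec-sync test.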
Reviewed changes
Copilot reviewed 4 out of 5 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| src/tools/utils/documentQuery.ts | Implements validated query argument shapes and builds a safe/allowlisted query string for /api/documents/. |
| src/tools/documents.ts | Refactors document retrieval to use a shared query executor and adds the query_documents tool. |
| src/tools/documents.test.ts | Adds tests for query string building, custom field query validation, and allowlist sync with the OpenAPI spec. |
| README.md | Documents query_documents as the canonical advanced query tool and updates list_documents / search_documents guidance. |
| .gitignore | Adds ignores for local dev artifacts (bun.lock, temp/). |
@baruchiro - I'm noting partial overlap with #70. This PR is narrower in overall scope, but different in query design:
So I do not think this is a duplicate, but there is overlap in

If #70 lands first, I can rebase this PR on top of it and adapt the implementation to the new annotations/test-helper shape.

The main open question is API direction: extending
baruchiro left a comment
The overall structure is good: extracting buildDocumentQueryString into its own module is clean, the FIRST_CLASS_QUERY_PARAM_MAP approach is solid, and the OpenAPI-sync test is a clever guard. A few things need attention before merging.
Dead code
PaperlessAPI.searchDocuments is now unreachable.
After this PR the search_documents tool routes through executeDocumentQuery → api.getDocuments(), so api.searchDocuments() is never called. Remove it from PaperlessAPI.ts.
queryDocumentsArgsSchema is exported but unused.
documentQuery.ts exports queryDocumentsArgsSchema = z.object(QUERY_DOCUMENTS_ARGS_SHAPE) but nothing imports it. Either use it where the shape is consumed or drop the export.
Redundant tests
Per project convention, tests should check real input/output. Several tests here duplicate what the buildDocumentQueryString unit tests already cover, or assert on schema metadata rather than behaviour.
- `list_documents keeps existing simple query behavior`: The parameter-to-query-string mapping is already exercised by `serializes first-class list filters using Paperless parameter names`. Going through the tool handler adds no new signal; the handler is a single `return executeDocumentQuery(api, args)` line.
- `search_documents remains a query-only compatibility wrapper`: This asserts on `Object.keys(schema)` and a description regex. Neither is an I/O check. The actual query forwarding (one param, same key) is trivially covered by the serializer tests.
- `query_documents exposes advanced query fields and uses shared execution`: The schema-key and description-regex assertions (`assert.ok("custom_field_query" in ...)`, `assert.match(...description, /custom field/i)`) are registration tests, not behaviour tests. The query execution part overlaps with the serializer unit tests. Trim this down to the execution path, or remove it entirely if nothing distinct is being tested.

If the test `custom_field_query JSON schema avoids tuple-style items arrays` is also removed (it tests zod-to-json-schema's output format, an internal detail), the zod-to-json-schema dev dependency can be dropped along with it.
Minor
isCustomFieldQuery accepts group operators as leaf field names.
["AND", "iexact", "foo"] passes the leaf branch (length === 3, both strings, valid value) before the group branch is checked. In practice Paperless field names won't be "AND"/"OR", but adding an early exclusion makes the intent explicit:
```ts
// leaf branch
if (
  value.length === 3 &&
  typeof value[0] === "string" &&
  !CUSTOM_FIELD_QUERY_GROUP_OPERATORS.includes(value[0] as any) &&
  typeof value[1] === "string" &&
  isCustomFieldQueryValue(value[2])
) { … }
```

Generated by Claude Code
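For context, a self-contained sketch of a recursive validator with that early exclusion might look like the following. The names `GROUP_OPERATORS`, `isQueryValue`, and `isCustomFieldQuerySketch` are stand-ins, and both the operator list and the set of accepted leaf values are assumptions; the PR's `isCustomFieldQuery` and `isCustomFieldQueryValue` may differ.

```typescript
// Assumed operator list; the PR's CUSTOM_FIELD_QUERY_GROUP_OPERATORS may differ.
// Typing it as readonly string[] avoids the need for an `as any` cast.
const GROUP_OPERATORS: readonly string[] = ["AND", "OR"];

// Assumed value rule: primitives, null, or arrays of those.
function isQueryValue(value: unknown): boolean {
  if (value === null) return true;
  if (["string", "number", "boolean"].includes(typeof value)) return true;
  return Array.isArray(value) && value.every((item) => isQueryValue(item));
}

function isCustomFieldQuerySketch(value: unknown): boolean {
  if (!Array.isArray(value)) return false;

  // Leaf: [fieldName, operator, value]. A group operator in the first
  // slot is excluded explicitly so it cannot pass as a field name.
  if (
    value.length === 3 &&
    typeof value[0] === "string" &&
    !GROUP_OPERATORS.includes(value[0]) &&
    typeof value[1] === "string" &&
    isQueryValue(value[2])
  ) {
    return true;
  }

  // Group: [operator, [subquery, ...]] with at least one valid subquery.
  return (
    value.length === 2 &&
    typeof value[0] === "string" &&
    GROUP_OPERATORS.includes(value[0]) &&
    Array.isArray(value[1]) &&
    value[1].length > 0 &&
    value[1].every((sub: unknown) => isCustomFieldQuerySketch(sub))
  );
}
```

With this shape, `["AND", "iexact", "foo"]` is rejected because it fails the leaf exclusion and has the wrong length for a group, while `["OR", [["a", "exact", 1], ["b", "exact", 2]]]` is accepted.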
Updated after review. Changes:
Verification:
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@src/tools/documents.test.ts`:
- Around line 26-31: In the getDocumentQueryParamsFromOpenApi function, add
explicit checks for the indexOf results before using them to slice the section.
Both the start index (from indexOf for "/api/documents/:") and end index (from
indexOf for "/api/documents/{id}/:") should be validated to ensure they are not
-1, which indicates the marker was not found. If either index is -1, throw an
error with a clear message that identifies which marker was missing, then
proceed with the text.slice operation only after both validations pass.
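A minimal sketch of the guarded slicing described above is shown here. The helper name `sliceDocumentsSection` is hypothetical and covers only the marker lookup, not the rest of `getDocumentQueryParamsFromOpenApi`. Starting the second search after the first marker is an extra assumption, not something the review asks for.

```typescript
// Hypothetical helper: extracts the text between the two path markers
// and fails loudly when either marker is missing.
function sliceDocumentsSection(text: string): string {
  const startMarker = "/api/documents/:";
  const endMarker = "/api/documents/{id}/:";

  const start = text.indexOf(startMarker);
  if (start === -1) {
    throw new Error(`OpenAPI marker not found: ${startMarker}`);
  }

  // Assumption: search for the end marker only after the start marker.
  const end = text.indexOf(endMarker, start + startMarker.length);
  if (end === -1) {
    throw new Error(`OpenAPI marker not found: ${endMarker}`);
  }

  return text.slice(start, end);
}
```

The checks matter because `String.prototype.slice` treats negative indices as offsets from the end of the string, so an unchecked `-1` would return a wrong section instead of raising an error, and the allowlist-sync test could then pass or fail for the wrong reason.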
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 704df819-27ad-4a7f-a813-ae0ff913b7c2
📒 Files selected for processing (5)
- README.md
- src/api/PaperlessAPI.ts
- src/tools/documents.test.ts
- src/tools/documents.ts
- src/tools/utils/documentQuery.ts
🚧 Files skipped from review as they are similar to previous changes (1)
- src/tools/documents.ts
Summary
Expose Paperless' richer document query capabilities through MCP so agents can filter documents server-side instead of scanning broad result sets client-side.
- `query_documents` as the canonical document query tool
- `list_documents` focused on simple listing
- `search_documents` as a compatibility wrapper
- `custom_field_query` and validated `paperless_filters`
- `.gitignore` for local development files

Verification
- `npm test`
- `npm run build`

Summary by CodeRabbit
Release Notes
New Features
- `query_documents` tool for advanced document querying with full-text search and structured Paperless-style filters (including custom field filtering).

Documentation
- `list_documents` documentation with additional pagination, sorting, and filter options.
- `search_documents` as a deprecated compatibility wrapper; recommended using `query_documents`.

Refactor
Tests
Chores
- `bun.lock` and `temp/`.