Document References Extraction System #1025
busbyk wants to merge 16 commits into refactor/esm-test-support from
Tests for the planned extractDocumentReferences function that will walk a document's content tree and extract all relationship/upload references into a flat array for the unified documentReferences revalidation field.

Uses real block configs and collection configs from the codebase (not synthetic mocks) so tests break when configs change. The BlocksFeature mock captures its blocks arg into serverFeatureProps.blocks, allowing the extraction function to derive block mappings from config introspection rather than needing them passed in.

Includes JSON fixtures modeled on real dev database content structures:
- page-about-us-layout: deep nesting (ContentBlock > CalloutBlock > ButtonBlock)
- page-who-we-are-layout: direct TeamBlocks in layout
- page-supporters-layout: hasMany SponsorsBlocks with 47 sponsor refs
- post-with-media-block: post with featuredImage + MediaBlock in richText

Tests are skipped (describe.skip) until implementation exists.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
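For readers who have not opened the test file, here is a rough sketch of the mock idea described in the commit above. It is not the actual mock from this PR: the `Block` and `MockFeature` types and the `allowedBlockSlugs` helper are hypothetical stand-ins, and only the `serverFeatureProps.blocks` shape comes from the commit message.

```typescript
// Hypothetical minimal shapes; the real Payload/Lexical types are much richer.
type Block = { slug: string; fields: unknown[] }
type MockFeature = { key: string; serverFeatureProps?: { blocks?: Block[] } }

// Stand-in for the mocked BlocksFeature: it only records the blocks it was given,
// so code under test can read them back from the feature config.
const BlocksFeature = (props: { blocks: Block[] }): MockFeature => ({
  key: 'blocks',
  serverFeatureProps: { blocks: props.blocks },
})

// Hypothetical helper: derive the allowed block slugs by introspecting features,
// instead of receiving a block mapping as an argument.
const allowedBlockSlugs = (features: MockFeature[]): string[] =>
  features.flatMap((feature) => feature.serverFeatureProps?.blocks ?? []).map((block) => block.slug)

const exampleFeatures = [
  BlocksFeature({ blocks: [{ slug: 'mediaBlock', fields: [] }] }),
  { key: 'bold' }, // a feature without blocks is ignored
]
const exampleSlugs = allowedBlockSlugs(exampleFeatures)
```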
Add the core extraction function that recursively walks a Payload document's field config tree and data to find all relationship/upload references at any nesting depth. This handles the deep nesting gap where blocks inside richText inside blocks were previously invisible to the revalidation system.

Handles: relationship, upload, blocks, richText (Lexical), group, array, row, collapsible, and tabs fields. Deduplicates on collection + docId.

Also updates the test suite: removes placeholder stub, imports real implementation, unskips tests, fixes Events unnamed group test data and post fixture dedup assertion. All 41 tests passing.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
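As a rough illustration of the walk described above (not the implementation in this PR), a simplified version could look like the following. The `Field` union, `DocumentReference` shape, and `extractRefsSketch` name are hypothetical; richText, tabs, unnamed groups, and polymorphic relationships are left out to keep the sketch short.

```typescript
// Hypothetical, simplified field shapes; Payload's real Field types are richer.
type Field =
  | { type: 'relationship' | 'upload'; name: string; relationTo: string; hasMany?: boolean }
  | { type: 'group' | 'array'; name: string; fields: Field[] }
  | { type: 'row' | 'collapsible'; fields: Field[] }
  | { type: 'blocks'; name: string; blocks: { slug: string; fields: Field[] }[] }
type DocumentReference = { collection: string; docId: string | number; fieldPath: string; blockType?: string }
type Data = Record<string, unknown>

const isRecord = (v: unknown): v is Data => typeof v === 'object' && v !== null && !Array.isArray(v)
const join = (path: string, name: string): string => (path ? `${path}.${name}` : name)

// Accept both unpopulated IDs and populated documents ({ id, ... }).
function idOf(value: unknown): string | number | undefined {
  if (typeof value === 'string' || typeof value === 'number') return value
  if (isRecord(value) && (typeof value.id === 'string' || typeof value.id === 'number')) return value.id
  return undefined
}

function walk(fields: Field[], data: Data, path: string, blockType: string | undefined, out: Map<string, DocumentReference>): void {
  for (const field of fields) {
    switch (field.type) {
      case 'relationship':
      case 'upload': {
        const raw = data[field.name]
        for (const value of Array.isArray(raw) ? raw : [raw]) {
          const docId = idOf(value)
          if (docId === undefined) continue
          const key = `${field.relationTo}:${docId}` // dedupe on collection + docId
          if (!out.has(key)) out.set(key, { collection: field.relationTo, docId, fieldPath: join(path, field.name), blockType })
        }
        break
      }
      case 'group': {
        const child = data[field.name]
        if (isRecord(child)) walk(field.fields, child, join(path, field.name), blockType, out)
        break
      }
      case 'array': {
        const rows = data[field.name]
        if (!Array.isArray(rows)) break
        rows.forEach((row, i) => {
          if (isRecord(row)) walk(field.fields, row, `${join(path, field.name)}.${i}`, blockType, out)
        })
        break
      }
      case 'row':
      case 'collapsible':
        // Layout-only fields: their children live on the same data object.
        walk(field.fields, data, path, blockType, out)
        break
      case 'blocks': {
        const rows = data[field.name]
        if (!Array.isArray(rows)) break
        rows.forEach((row, i) => {
          if (!isRecord(row)) return
          const block = field.blocks.find((b) => b.slug === row.blockType)
          if (block) walk(block.fields, row, `${join(path, field.name)}.${i}`, block.slug, out)
        })
        break
      }
    }
  }
}

function extractRefsSketch(fields: Field[], data: Data): DocumentReference[] {
  const out = new Map<string, DocumentReference>()
  walk(fields, data, '', undefined, out)
  return Array.from(out.values())
}
```

Keying the `Map` on `collection:docId` means the first occurrence of a reference wins, so its `fieldPath` and `blockType` are the ones that are kept.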
…ocumentReferences Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…onship queries
eventsBlockMappings was using postsBlocks instead of eventsBlocks, and the events collection was completely omitted from relationship reference tracking and queries.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…outable collections
Create a reusable documentReferencesField (array with collection, docId, blockType,
fieldPath sub-fields) and a generic populateDocumentReferences beforeChange hook that
calls extractDocumentReferences to walk the full document tree. Wire both into all 4
routable collections: Pages, Posts, HomePages, Events.
The field is hidden by default — only visible to super admins who opt in via
localStorage.setItem('showDocRefs', '1').
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
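To make the shape of the field and hook concrete, here is a minimal sketch under stated assumptions. The sub-field types (`text`), the `HookArgs` stand-in, and the `makePopulateDocumentReferences` factory are hypothetical and exist only so the example runs without Payload installed; the real hook in this PR is typed as `CollectionBeforeChangeHook` and calls extractDocumentReferences directly.

```typescript
// Hypothetical local stand-ins for Payload types.
type Ref = { collection: string; docId: string; blockType?: string; fieldPath: string }
type HookArgs = { data: Record<string, unknown>; collection: { slug: string; fields: unknown[] } }
type Extractor = (fields: unknown[], data: Record<string, unknown>) => Ref[]

// Array field with the four sub-fields named in the commit message.
// The sub-field types are an assumption, not taken from the PR.
const documentReferencesField = {
  name: 'documentReferences',
  type: 'array',
  fields: [
    { name: 'collection', type: 'text', required: true },
    { name: 'docId', type: 'text', required: true },
    { name: 'blockType', type: 'text' },
    { name: 'fieldPath', type: 'text' },
  ],
} as const

// beforeChange-style hook: recompute the references from the incoming data on every
// save and return the modified data. The extractor is injected so the sketch is testable.
const makePopulateDocumentReferences =
  (extract: Extractor) =>
  ({ data, collection }: HookArgs): Record<string, unknown> => ({
    ...data,
    documentReferences: extract(collection.fields, data),
  })
```

Recomputing the whole array on each save, rather than patching it, keeps stale references from lingering after a block is removed.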
Replace the conditional visibility logic (super admin + localStorage check) with simple disabled: true to match the blocksInContent field pattern. The field data remains accessible via API responses. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…cuments Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Preview deployment: https://revalidation.preview.avy-fx.org
 * so it handles all field types including richText with BlocksFeature at any
 * nesting depth.
 */
export const populateDocumentReferences: CollectionBeforeChangeHook = ({ data, collection }) => {
Is it possible for there to be a cycle in the references, and, if so, what happens?
I don't believe it's possible for this to cycle. This is a before change hook so we can modify data here and that does not re-trigger hooks.
Cycling is probably more of a concern in the next PR #1026, but that's also not re-triggering hooks because the revalidation system only calls Next.js' revalidatePath, which does not change any data in the db or interact with the Payload Local API or REST API besides find queries.
Good thought though and we should ensure this isn't possible. So I added a test for that in #1026: ce5e8d8 (that PR is still in draft atm fyi so hold off on full review until I mark ready).
Description
The foundation for unified document reference revalidation: a single documentReferences field on all routable collections that replaces the two parallel revalidation subsystems (block reference tracking in richText type fields + relationship reference tracking). This PR adds the field, the extraction system, and a backfill migration. The old system remains in place and continues to drive revalidation; the new system runs alongside it temporarily for comparison.

This is the first phase. #1026 implements the revalidation logic based on querying documentReferences and removes the old system.

Related Issues
First step for #455.
Key Changes
hooks.
How to test
Screenshots / Demo video
https://www.loom.com/share/1e190bb2960445adb5f80492a070321a
Migration Explanation
20260404_012604_add_documentReferences_field: Adds the documentReferences field to routable collections.
20260404_015415_backfill_document_references: Data-only migration (no schema change). Iterates all published documents in pages, posts, homePages, and events, performing a no-op update (data: {}) on each. This triggers the populateDocumentReferences beforeChange hook, populating the new documentReferences field. Uses context.disableRevalidate to prevent cascading revalidation during the backfill. Draft documents are not backfilled.
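The backfill loop described above can be sketched roughly as follows. This is not the migration file itself: `PayloadLike` is a hypothetical, narrowed stand-in for the Payload Local API (`find` and `update`), and the `_status` filter and the page size are assumptions about how "published" documents are selected.

```typescript
// Hypothetical, narrowed view of the Payload Local API used by this sketch.
type FindResult = { docs: { id: string | number }[]; hasNextPage: boolean }
interface PayloadLike {
  find(args: { collection: string; where: Record<string, unknown>; limit: number; page: number; depth: number }): Promise<FindResult>
  update(args: { collection: string; id: string | number; data: Record<string, unknown>; context: Record<string, unknown> }): Promise<unknown>
}

const ROUTABLE_COLLECTIONS = ['pages', 'posts', 'homePages', 'events']

async function backfillDocumentReferences(payload: PayloadLike, pageSize = 100): Promise<number> {
  let updated = 0
  for (const collection of ROUTABLE_COLLECTIONS) {
    let page = 1
    let hasNextPage = true
    while (hasNextPage) {
      // Assumption: drafts are excluded by filtering on _status.
      const result = await payload.find({
        collection,
        where: { _status: { equals: 'published' } },
        limit: pageSize,
        page,
        depth: 0,
      })
      for (const doc of result.docs) {
        // No-op update: an empty data object still runs the beforeChange hooks,
        // which is what populates documentReferences.
        await payload.update({
          collection,
          id: doc.id,
          data: {},
          context: { disableRevalidate: true }, // skip revalidation during the backfill
        })
        updated++
      }
      hasNextPage = result.hasNextPage
      page++
    }
  }
  return updated
}
```

Because the update does not change a document's published status, the set matched by the filter stays the same while the loop pages through it.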
Future enhancements / Questions
The next PR will:
Restricting lookups to configured blocks
When walking a richText field's Lexical AST, the function builds a Map<slug, Block> from the BlocksFeature's blocks array (i.e. the allowed blocks).

For each type: 'block' node in the AST, it looks up fields.blockType in the map. If the blockType isn't in the allowed blocks, it's skipped (the "unknown blockType" edge case tests). If it IS found, the Block's fields schema is used to recursively extract references from that block's data.

This feels like the correct decision but I felt like it was worth noting in case someone sees an issue with this logic.
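A minimal sketch of that lookup, assuming simplified shapes: `Block`, `LexicalNode`, and `collectBlockRefs` are hypothetical names, and for brevity the sketch only reads top-level relationship/upload fields of a matched block instead of recursing through its full schema as the real function does.

```typescript
// Hypothetical, simplified shapes for a block config and a serialized Lexical node.
type Block = { slug: string; fields: { name: string; type: string; relationTo?: string }[] }
type LexicalNode = { type: string; fields?: Record<string, unknown>; children?: LexicalNode[] }
type BlockRef = { collection: string; docId: string | number; blockType: string }

function collectBlockRefs(root: LexicalNode, allowedBlocks: Block[]): BlockRef[] {
  // Map<slug, Block> built from the blocks the BlocksFeature was configured with.
  const bySlug = new Map<string, Block>(allowedBlocks.map((block) => [block.slug, block]))
  const refs: BlockRef[] = []

  const visit = (node: LexicalNode): void => {
    if (node.type === 'block' && node.fields) {
      const blockType = node.fields.blockType
      const block = typeof blockType === 'string' ? bySlug.get(blockType) : undefined
      // Unknown blockType: no schema to interpret the data with, so the node is skipped.
      if (block) {
        for (const field of block.fields) {
          if ((field.type !== 'relationship' && field.type !== 'upload') || !field.relationTo) continue
          const value = node.fields[field.name]
          if (typeof value === 'string' || typeof value === 'number') {
            refs.push({ collection: field.relationTo, docId: value, blockType: block.slug })
          }
        }
      }
    }
    node.children?.forEach(visit)
  }

  visit(root)
  return refs
}
```

The trade-off is that data left behind by a block that was later removed from the allowed list is ignored, since there is no schema to say which of its values are references.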