fix(gatsby-source-strapi): nested relation cache invalidation #524
molund wants to merge 2 commits into
Conversation
Thanks for the PR! I'll likely take a look next week, or maybe @laurenskling will have time, as he's the resident Strapi expert.
laurenskling
left a comment
Interesting.. I have this use case myself: when a user adds a piece of content and relates it to a Page, Gatsby doesn't update that Page because the Page itself did not receive an update. I guess this PR is trying to fix that?
    // recursive function to traverse the entire data object
    function traverse(node) {
      if (Array.isArray(node)) {
Thinking out loud here.. we assume an array is always relations? What else do we have in Strapi that could be arrays... images? I guess we would want this on image dates as well.. something else?
For this change, our goal is strictly to derive the maximum updatedAt and publishedAt values from nested relations. The traversal logic will go into other content structures (components, dynamic zones, media, JSON), but we’ll only collect dates from nested objects with updatedAt / publishedAt fields. If other nested objects happen to use the same updatedAt / publishedAt naming convention, that would simply be an added bonus.
We don't really use the Strapi media library to its full ability on bcparks.ca so I'm not overly familiar with what the image responses might look like. But it won't break the generic traversal logic.
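As a rough illustration of the traversal described above, here is a minimal sketch. It is not the code from this PR: the body of findLatestDates, the consider helper, and the sample page object are all assumptions made for the example. It visits every nested object and array and only picks up updatedAt / publishedAt values that parse as dates.

```javascript
// Hypothetical sketch, not the PR's implementation.
// Returns the latest updatedAt / publishedAt found anywhere in a nested entry.
function findLatestDates(entry) {
  const latest = { updatedAt: undefined, publishedAt: undefined };

  // Keep `value` for `key` if it is a parseable date string newer than the current one.
  function consider(key, value) {
    if (typeof value !== "string") return;
    const time = Date.parse(value);
    if (Number.isNaN(time)) return;
    if (latest[key] === undefined || time > Date.parse(latest[key])) {
      latest[key] = value;
    }
  }

  // Recursive walk: arrays are not assumed to be relations, they are simply iterated.
  function traverse(node) {
    if (Array.isArray(node)) {
      node.forEach(traverse);
      return;
    }
    if (node === null || typeof node !== "object") return;
    consider("updatedAt", node.updatedAt);
    consider("publishedAt", node.publishedAt);
    Object.values(node).forEach(traverse);
  }

  traverse(entry);
  return latest;
}

// Made-up entry: the newest updatedAt lives inside a nested relation.
const page = {
  title: "Park page",
  updatedAt: "2024-01-01T00:00:00.000Z",
  publishedAt: "2024-01-01T00:00:00.000Z",
  advisories: [
    { text: "Trail closed", updatedAt: "2024-03-05T12:00:00.000Z", publishedAt: null },
  ],
  hero: { image: { url: "/a.jpg", updatedAt: "2024-02-01T00:00:00.000Z" } },
};

const latest = findLatestDates(page);
console.log(latest);
```

Objects without these two fields (plain components, JSON blobs) are walked but contribute nothing, which matches the "added bonus" behaviour mentioned above. The sketch assumes the payload is plain JSON with no circular references.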
    return {
      ...cleanAttributes(getAttributes(data, version), currentContentTypeSchema, schemas, version),
      ...cleaned,
      updatedAt: latest.updatedAt || cleaned.updatedAt || undefined,
cleaned.updatedAt is redundant here, right? As it will always be the latest, because it went thru findLatestDates, right?
If I make the change suggested above here https://github.com/gatsby-uc/plugins/pull/524/changes#r2935638874 then this comment can be resolved.
      version,
    );
    const latest = findLatestDates(cleaned);
    return {
Should we actually mutate the updatedAt and publishedAt here? Maybe someone is sorting on one of these fields; I'm not sure we want them to be changed. Isn't the goal achieved by adding a new value? Adding changedAt with a result from findLatestDates (now we would need only one?) will still shake the caching, right?
I’m not familiar with the internals of Gatsby’s caching, but if changedAt would also invalidate the digest, then it’s probably the less destructive option. With Strapi 5, publishedAt has become much more important, compared to Strapi 4 where it was less significant. A combined field should therefore take the maximum value of both.
I would probably prefer something more generic like strapi_timestamp, to align with strapi_id and strapi_component, which already exist in the Gatsby GraphQL schema. Prefixing will prevent name collisions with user‑defined fields.
If you’re in agreement with strapi_timestamp, I’ll push a change.
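To make the proposal concrete, here is a small sketch of a combined field that leaves updatedAt and publishedAt untouched. The helper names latestTimestamp and withStrapiTimestamp, and the sample objects, are hypothetical and not part of the plugin; only the field name strapi_timestamp comes from this thread.

```javascript
// Hypothetical helper: pick the most recent of several ISO date strings.
// Non-string or unparseable candidates (null, undefined, "") are skipped.
function latestTimestamp(...candidates) {
  let best;
  for (const candidate of candidates) {
    if (typeof candidate !== "string") continue;
    const time = Date.parse(candidate);
    if (Number.isNaN(time)) continue;
    if (best === undefined || time > Date.parse(best)) best = candidate;
  }
  return best;
}

// `cleaned` stands in for a cleaned entry; `latest` for the nested maxima
// (for example, the result of a findLatestDates-style traversal).
function withStrapiTimestamp(cleaned, latest) {
  return {
    ...cleaned, // updatedAt / publishedAt stay exactly as Strapi returned them
    strapi_timestamp: latestTimestamp(latest.updatedAt, latest.publishedAt),
  };
}

const cleaned = {
  title: "Park page",
  updatedAt: "2024-01-01T00:00:00.000Z",
  publishedAt: "2024-01-02T00:00:00.000Z",
};
const nestedLatest = {
  updatedAt: "2024-03-05T12:00:00.000Z",
  publishedAt: "2024-04-01T08:30:00.000Z",
};

const result = withStrapiTimestamp(cleaned, nestedLatest);
console.log(result.strapi_timestamp);
```

Anyone sorting or filtering on updatedAt or publishedAt would see the original values, while the extra field carries the most recent change from anywhere in the entry.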
I think we need to be sure that this will actually trigger Gatsby's caching. As I'm not sure either. Maybe @moonmeister is more skilled in this area?
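One way to reason about this question: Gatsby nodes carry a content digest, and a digest computed over the whole entry changes whenever any field changes. The sketch below uses Node's crypto module as a stand-in. The digest function is hypothetical, and whether the plugin's real digest covers a newly added field is exactly the assumption that would need to be verified.

```javascript
const crypto = require("node:crypto");

// Stand-in for a content digest: a hash over the serialized entry.
// Assumption: the real digest is computed over all of the entry's fields.
function digest(entry) {
  return crypto.createHash("sha256").update(JSON.stringify(entry)).digest("hex");
}

// Same parent entry before and after a nested relation changed.
// The parent's own updatedAt is identical in both cases.
const before = {
  title: "Park page",
  updatedAt: "2024-01-01T00:00:00.000Z",
  strapi_timestamp: "2024-01-01T00:00:00.000Z",
};
const after = {
  ...before,
  strapi_timestamp: "2024-03-05T12:00:00.000Z",
};

const digestBefore = digest(before);
const digestAfter = digest(after);
console.log(digestBefore === digestAfter);
```

If the digest only covered a subset of fields, or if change detection relied on something other than the digest, the new field would have no effect, so this is worth confirming against the plugin's node creation code.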
strapi_timestamp sounds good to me
Exactly. bcparks.ca has a very deeply nested populate structure on the
Possibly ignorant question here, is there a reason we don't treat these items as their own nodes instead of nesting within the parent?
We could treat them as their own nodes (which the plugin already supports) but then we’d need to stitch them back together at render time. The deep queries populate approach avoids that by treating the parent as a materialized “view” and asking the Strapi API to populate the relations, so we get one already‑joined payload per record and generate one top‑level node per Gatsby page.
This is not really true. https://github.com/gatsby-uc/plugins/blob/main/packages/gatsby-source-strapi/src/normalize.js#L216 relations are always turned into nodes. You can top-level query your relations now in Gatsby. It's irrelevant that you've used the deep query populate (which is also the only way to get relations..). They are stitched back together, with but as your suggested change is at the clean-data stage, it's when we fetch, before we create nodes, so this field will be added. It will only work if you populate the
I'm going to close this PR because I discovered that my fix doesn't actually populate the nested relations. I tried to fix the code that populates the nested relations in The code for populating relations is insert-only with no working upsert behaviour. When I try to implement upsert behaviour, there are minor schema differences that can't be resolved. It might be an issue with the timing of the schema auto-discovery, but I'm not sure.
@molund if I read between the lines about the issue you are experiencing, it's that nested queried data doesn't update at all, right? What you want to do is only fetch the The GraphQL schema will link these two together, thus making it possible to use nested data in your GraphQL queries. You don't need to do that at fetching time. It's even better not to do it, as it will result in the same Node being merged together, possibly mismatching and losing data (like requesting fewer fields on a second or third fetch query). With this, updates will update that root content and nested GraphQL queries will receive that as well.
Description
Gatsby's cache invalidation for gatsby-source-strapi only considered the top-level updatedAt / publishedAt fields on a Strapi entry. When a nested relation was updated in Strapi, the parent entry's top-level timestamps remained unchanged, causing Gatsby to treat the entry as unmodified and serve stale data.
Solution
Added a findLatestDates utility to clean-data.js that recursively traverses the entire cleaned entry, including all nested objects and arrays, and returns the maximum updatedAt and publishedAt values found anywhere in the document tree.
cleanData now overwrites the top-level updatedAt and publishedAt with these maximums before returning, ensuring that any change to any relation at any depth of nesting will correctly bust the cache.
Compatible with both Strapi v4 and v5 because findLatestDates operates on the output of cleanAttributes, not on the raw API response.
Related Issues
Fixes #523