Skip to content

fix(datatrak): RN-1819: detect descendant changes in permission group hierarchy#6708

Open
jaskfla wants to merge 11 commits intodevfrom
rn-1819-descentant-pg
Open

fix(datatrak): RN-1819: detect descendant changes in permission group hierarchy#6708
jaskfla wants to merge 11 commits intodevfrom
rn-1819-descentant-pg

Conversation

@jaskfla
Copy link
Copy Markdown
Contributor

@jaskfla jaskfla commented Apr 1, 2026

RN-1819

🦸 Review Hero

  • Run Review Hero
  • Auto-fix review suggestions
  • Auto-fix CI failures

Comment on lines +23 to +35
WITH RECURSIVE tree AS (
SELECT id, parent_id, 1 AS depth
FROM permission_group
WHERE parent_id IS NULL

UNION ALL

SELECT pg.id, pg.parent_id, t.depth + 1
FROM permission_group pg
INNER JOIN tree t ON pg.parent_id = t.id
)
SELECT max(depth) + :headroom AS height
FROM tree
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As of today, the maximum permission group tree height is 7

@review-hero
Copy link
Copy Markdown

review-hero Bot commented Apr 1, 2026

🦸 Review Hero (could not post inline comments — showing here instead)

packages/sync/src/utils/hasPermissionGroupHierarchyChangeInSyncSnapshot.ts:68

[Bugs & Correctness] critical

:maxDepth is a Knex RawBuilder object, but executeSql (which calls this.connection.raw(sql, parametersToBind)) only supports primitive values as named bindings. Knex does not inline a nested RawBuilder when it appears as a value in a named-bindings object — it will either throw or serialize as [object Object]. If serialized, w.depth < '[object Object]' is always false, so the recursive walk never expands beyond the seed rows and the function returns false even when a descendant has changed. Fix: either compute maxDepth with a separate executeSql call first and pass the integer result, or inline the subquery directly into the template string without binding.


packages/sync/src/utils/hasPermissionGroupHierarchyChangeInSyncSnapshot.ts:49

[Bugs & Correctness] suggestion

The recursive walk CTE uses UNION ALL instead of UNION, so if the snapshot contains many changed permission groups that share ancestors, those shared ancestors are visited once per distinct path — not once total. For example, with 50 changed leaf nodes under a single root, the root is visited 50 times at depth N, producing 50×depth rows. PostgreSQL recursive CTEs support UNION to deduplicate rows within the recursion. Changing to UNION would prevent re-expanding already-seen (current_id, depth) pairs and is safer if the hierarchy ever contains cycles (bad data or mid-sync state). The EXISTS check is still correct with UNION ALL, but in large syncs this is wasteful and bounded only by maxDepth × |base nodes|.


packages/datatrak-web/src/sync/ClientSyncManager.ts:360

[Design & Architecture] suggestion

The two async wrapper lambdas exist solely to defer evaluation for short-circuit ||, but await already short-circuits naturally in sequence. Replace with direct calls:

if (
  await hasSyncSnapshotRecords(this.database, sessionId, undefined, this.models.userEntityPermission.databaseRecord, "data->>'user_id' = :userId", { userId: currentUserId }) ||
  await hasPermissionGroupHierarchyChangeInSyncSnapshot(this.database, sessionId, currentUserId)
) {

The wrappers add 12 lines of indirection with no benefit — lambdas wrapping a single awaited call is the same as just calling the function.


packages/sync/src/utils/hasPermissionGroupHierarchyChangeInSyncSnapshot.ts:40

[Design & Architecture] suggestion

The maxDepth tree-height CTE performs a full recursive scan of permission_group unconditionally, even when the snapshot contains zero permission_group records (the common case — nothing changed). The expensive scan should be guarded. One approach: add an early-exit CTE at the top of the main query:

changed_pgs AS (
  SELECT 1 FROM ${snapshotTable} WHERE record_type = :recordType LIMIT 1
),

and gate walk and maxDepth on EXISTS (SELECT 1 FROM changed_pgs). Alternatively, check record existence before calling this function at all — the caller already has hasSyncSnapshotRecords available.


packages/sync/src/utils/hasPermissionGroupHierarchyChangeInSyncSnapshot.ts:56

[Performance] suggestion

The LATERAL subquery inside the recursive walk CTE fires two separate SELECTs per recursion step: one against permission_group and one against the snapshot table. Combined with COALESCE, Postgres must evaluate both sides for every row at every depth level. Because the snapshot table's data column is JSONB, the filter s2.data ->> 'id' = w.current_id is almost certainly doing a sequential scan of all snapshot rows of type permission_group on each step (no functional index on a JSON extraction). For large snapshots or deep trees this becomes quadratic. A more efficient approach would be to materialise the snapshot's permission_group rows into a CTE once (extracting id/parent_id) at the start of the query, then join against that CTE rather than re-scanning the snapshot table per recursion step.

@review-hero
Copy link
Copy Markdown

review-hero Bot commented Apr 1, 2026

🦸 Review Hero Summary
12 agents reviewed this PR | 1 critical | 4 suggestions | 0 nitpicks | Filtering: consensus 3 voters, 4 below threshold

Below consensus threshold (4 unique issues not confirmed by majority)
Location Agent Severity Comment
packages/datatrak-web/src/sync/ClientSyncManager.ts:375 Performance nitpick The two async checks are evaluated sequentially due to || short-circuit semantics, which is intentional for the fast-path. However, if both checks will usually both return false (common case duri...
packages/datatrak-web/src/sync/ClientSyncManager.ts:376 Bugs & Correctness suggestion If hasPermissionGroupHierarchyChangeInSyncSnapshot throws (e.g. due to the maxDepth binding bug or a DB error), the entire checkForPermissionChanges call rejects and no `updatePermissionsChan...
packages/sync/src/utils/hasPermissionGroupHierarchyChangeInSyncSnapshot.ts:21 Bugs & Correctness suggestion If the permission_group table is empty (e.g. fresh device before any data is synced), max(depth) over an empty set returns NULL, so the subquery yields NULL. w.depth < NULL is always UNKNOWN ...
packages/sync/src/utils/hasPermissionGroupHierarchyChangeInSyncSnapshot.ts:47 Performance suggestion The walk seed starts from ALL permission_group records in the snapshot (no filter beyond record_type). If a sync session contains many permission_group changes, the recursive CTE expands each...
Local fix prompt (copy to your coding agent)

Fix these issues identified on the pull request. One commit per issue fixed.


packages/sync/src/utils/hasPermissionGroupHierarchyChangeInSyncSnapshot.ts:68: :maxDepth is a Knex RawBuilder object, but executeSql (which calls this.connection.raw(sql, parametersToBind)) only supports primitive values as named bindings. Knex does not inline a nested RawBuilder when it appears as a value in a named-bindings object — it will either throw or serialize as [object Object]. If serialized, w.depth &lt; '[object Object]' is always false, so the recursive walk never expands beyond the seed rows and the function returns false even when a descendant has changed. Fix: either compute maxDepth with a separate executeSql call first and pass the integer result, or inline the subquery directly into the template string without binding.


packages/sync/src/utils/hasPermissionGroupHierarchyChangeInSyncSnapshot.ts:49: The recursive walk CTE uses UNION ALL instead of UNION, so if the snapshot contains many changed permission groups that share ancestors, those shared ancestors are visited once per distinct path — not once total. For example, with 50 changed leaf nodes under a single root, the root is visited 50 times at depth N, producing 50×depth rows. PostgreSQL recursive CTEs support UNION to deduplicate rows within the recursion. Changing to UNION would prevent re-expanding already-seen (current_id, depth) pairs and is safer if the hierarchy ever contains cycles (bad data or mid-sync state). The EXISTS check is still correct with UNION ALL, but in large syncs this is wasteful and bounded only by maxDepth × |base nodes|.


packages/datatrak-web/src/sync/ClientSyncManager.ts:360: The two async wrapper lambdas exist solely to defer evaluation for short-circuit ||, but await already short-circuits naturally in sequence. Replace with direct calls:

ts if ( await hasSyncSnapshotRecords(this.database, sessionId, undefined, this.models.userEntityPermission.databaseRecord, "data->>'user_id' = :userId", { userId: currentUserId }) || await hasPermissionGroupHierarchyChangeInSyncSnapshot(this.database, sessionId, currentUserId) ) {

The wrappers add 12 lines of indirection with no benefit — lambdas wrapping a single awaited call is the same as just calling the function.


packages/sync/src/utils/hasPermissionGroupHierarchyChangeInSyncSnapshot.ts:40: The maxDepth tree-height CTE performs a full recursive scan of permission_group unconditionally, even when the snapshot contains zero permission_group records (the common case — nothing changed). The expensive scan should be guarded. One approach: add an early-exit CTE at the top of the main query:

sql changed_pgs AS ( SELECT 1 FROM ${snapshotTable} WHERE record_type = :recordType LIMIT 1 ),

and gate walk and maxDepth on EXISTS (SELECT 1 FROM changed_pgs). Alternatively, check record existence before calling this function at all — the caller already has hasSyncSnapshotRecords available.


packages/sync/src/utils/hasPermissionGroupHierarchyChangeInSyncSnapshot.ts:56: The LATERAL subquery inside the recursive walk CTE fires two separate SELECTs per recursion step: one against permission_group and one against the snapshot table. Combined with COALESCE, Postgres must evaluate both sides for every row at every depth level. Because the snapshot table's data column is JSONB, the filter s2.data ->> 'id' = w.current_id is almost certainly doing a sequential scan of all snapshot rows of type permission_group on each step (no functional index on a JSON extraction). For large snapshots or deep trees this becomes quadratic. A more efficient approach would be to materialise the snapshot's permission_group rows into a CTE once (extracting id/parent_id) at the start of the query, then join against that CTE rather than re-scanning the snapshot table per recursion step.

Comment thread packages/sync/src/utils/index.ts
Comment thread packages/datatrak-web/src/sync/ClientSyncManager.ts
@jaskfla
Copy link
Copy Markdown
Contributor Author

jaskfla commented Apr 2, 2026

packages/sync/src/utils/hasPermissionGroupHierarchyChangeInSyncSnapshot.ts:68

[Bugs & Correctness] critical

:maxDepth is a Knex RawBuilder object, but executeSql (which calls this.connection.raw(sql, parametersToBind)) only supports primitive values as named bindings. Knex does not inline a nested RawBuilder when it appears as a value in a named-bindings object — it will either throw or serialize as [object Object]. If serialized, w.depth < '[object Object]' is always false, so the recursive walk never expands beyond the seed rows and the function returns false even when a descendant has changed. Fix: either compute maxDepth with a separate executeSql call first and pass the integer result, or inline the subquery directly into the template string without binding.

@chris-bes Confident Review Hero is wrong about [object Object]. Knex explicitly supports inner queries like this; but Review Hero’s suggestion of templating inner query directly in also works. I’ve tested this and am happy to do that

- WHERE w.depth < :maxDepth
+ WHERE w.depth < (
+   WITH RECURSIVE tree AS (
+     SELECT id, parent_id, 1 AS depth
+     FROM permission_group
+     WHERE parent_id IS NULL
+ 
+     UNION ALL
+ 
+     SELECT pg.id, pg.parent_id, t.depth + 1
+     FROM permission_group pg
+     INNER JOIN tree t ON pg.parent_id = t.id
+   )
+   SELECT max(depth) + :headroom AS max_recursion_depth
+   FROM tree
+ )

Copy link
Copy Markdown
Contributor

@chris-bes chris-bes left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

Copy link
Copy Markdown

@cursor cursor Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cursor Bugbot has reviewed your changes and found 1 potential issue.

Fix All in Cursor

❌ Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, enable autofix in the Cursor dashboard.

Reviewed by Cursor Bugbot for commit c1a1ec2. Configure here.

FROM tree
)`,
{ headroom: 10 },
);
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Raw builder object used as named binding value

High Severity

maxDepth is assigned a Knex Raw builder (the unawaited return of database.connection.raw(...)) and then passed as a value inside a named-bindings object to executeSql. While Knex supports QueryBuilder/Raw instances as positional bindings (array elements — as seen in BaseDatabase.exists()), its support for Raw instances as values in named-binding objects (:maxDepth) is undocumented and has no precedent in this codebase. If the named-binding path doesn't inline the Raw object, it would serialize as [object Object], causing a PostgreSQL type error or incorrect comparison in w.depth < :maxDepth. A safer approach is to inline the subquery SQL directly into the main query string or use a hardcoded constant.

Additional Locations (1)
Fix in Cursor Fix in Web

Reviewed by Cursor Bugbot for commit c1a1ec2. Configure here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants