fix: skip root-owned jiti cache in deploy rsync (exit 23) #3788

Merged
ryota-murakami merged 1 commit into main from fix/deploy-rsync-skip-jiti-cache
May 31, 2026

@ryota-murakami ryota-murakami commented May 31, 2026

Collaborator

Problem

The push-to-main deploy aborts at the artifact-sync rsync with exit 23, before pm2 restart — so the new bundle is synced to disk and prisma migrate deploy runs, but PM2 keeps serving the old code (same "silent failure" shape as #3787, different cause).

Runtime error from the failed run (Extract and restart server step):

rsync: [generator] delete_file: unlink(node_modules/.cache/jiti/nsx-prisma.config.88e0c7e4.mjs) failed: Permission denied (13)
cannot delete non-empty directory: node_modules/.cache/jiti
rsync error: some files/attrs were not transferred (see previous errors) (code 23) at main.c(1347) [sender=3.2.7]

Root cause

node_modules/.cache/jiti/ is jiti's compiled cache of prisma.config.ts (Prisma 7 loads its .ts config through jiti, which writes content-hashed output there). On the production host it is owned by root:

drwxr-xr-x 3 root root  node_modules/.cache
drwxr-xr-x 2 root root  node_modules/.cache/jiti
-rw-r--r-- 1 root root  node_modules/.cache/jiti/nsx-prisma.config.88e0c7e4.mjs

A manual pnpm deploy (scripts/deploy, which SSHes in as root per the maintainer's ssh config) runs prisma migrate and creates the cache root-owned. The GitHub Actions deploy rsync runs as a non-root user — it cannot unlink a root-owned file inside a root-owned 0755 directory → --delete fails → exit 23.
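
The permission rule behind this can be reproduced without root. The following is a minimal sketch, assuming a POSIX shell environment: `can_unlink` is a hypothetical helper (not part of the deploy workflow), and the demo simulates the "no write access to the parent directory" situation with `chmod 0555` rather than with real root ownership.

```shell
# Removing a file is a write to its parent directory, so the directory's
# owner and mode decide whether unlink() succeeds, not the file's own mode.
# can_unlink is a hypothetical helper; it ignores sticky bits and ACLs.
can_unlink() {
  local path="$1" parent
  parent="$(dirname -- "$path")"
  # Write + search (execute) permission on the parent directory is required.
  [ -w "$parent" ] && [ -x "$parent" ]
}

demo="$(mktemp -d)"
mkdir "$demo/cache"
touch "$demo/cache/example.mjs"

if can_unlink "$demo/cache/example.mjs"; then
  echo "deletable: parent directory is writable"
fi

# Drop write permission on the directory (stand-in for "owned by someone else, 0755").
chmod 0555 "$demo/cache"
if ! can_unlink "$demo/cache/example.mjs"; then
  # Not reached when the script itself runs as root, since root bypasses mode bits.
  echo "not deletable: parent directory is not writable"
fi

chmod 0755 "$demo/cache"   # restore so the temp dir can be cleaned up
rm -rf "$demo"
```

The second message is the analogue of the `Permission denied (13)` above: the file's own `-rw-r--r--` mode never enters the decision.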

Fix

Add an anchored --exclude='/node_modules/.cache/' to all three deploy rsyncs (artifact sync + both rollback rsyncs):

rsync -a --delete --exclude='/.env' --exclude='/logs/' --exclude='/run/' --exclude='/node_modules/.cache/' ...

--delete now never touches the jiti cache — a regenerable runtime artifact that has no business being pruned by a deploy — so ownership is irrelevant. Anchored with a leading / so it matches only the top-level node_modules/.cache, not deep node_modules/**/.cache dirs (same anchoring discipline as the /logs/ fix in #3787).
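
The anchoring behaviour can be checked in a throwaway directory. This is a sketch under stated assumptions: the `staging`/`live` layout and the function name `demo_anchored_exclude` are made up for illustration and are not the real deploy paths; only the `rsync -a --delete --exclude=...` flags come from the PR.

```shell
# Sketch: a leading-slash exclude protects only the top-level cache directory.
# All paths live in a temp dir; nothing here mirrors the real host layout.
demo_anchored_exclude() {
  local work src dst
  work="$(mktemp -d)"
  src="$work/staging"
  dst="$work/live"

  # Source (staging) contains no cache directories at all.
  mkdir -p "$src/node_modules/pkg"
  echo "new" > "$src/node_modules/pkg/index.js"

  # Destination has a top-level cache and a nested one inside a package.
  mkdir -p "$dst/node_modules/.cache/jiti" \
           "$dst/node_modules/pkg/node_modules/.cache"
  echo "cached" > "$dst/node_modules/.cache/jiti/example.mjs"
  echo "stale"  > "$dst/node_modules/pkg/node_modules/.cache/stale.txt"

  # The pattern is anchored to the transfer root by its leading slash.
  rsync -a --delete --exclude='/node_modules/.cache/' "$src/" "$dst/" || return

  [ -e "$dst/node_modules/.cache/jiti/example.mjs" ] && echo "top-level cache: kept"
  [ -e "$dst/node_modules/pkg/node_modules/.cache" ] || echo "nested cache: pruned"
  rm -rf "$work"
}

if command -v rsync >/dev/null 2>&1; then
  demo_anchored_exclude
else
  echo "rsync not found; skipping demo"
fi
```

With rsync available, the demo is expected to report the top-level cache as kept and the nested one as pruned, which is the distinction the leading `/` is there to make.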

Verification (production host, rsync 3.2.7)

  • find /home/deploy/nsx/node_modules -uid 0 returns only .cache and its contents — zero other root-owned paths. So this single exclude fully unblocks the deploy; there is no second root-owned path waiting to fail the next run.
  • This run also empirically confirmed the anchor fix from #3787 (fix(deploy): anchor rsync excludes to transfer root to prevent exit 23): there were no @sentry "cannot delete" errors this time. The deploy got all the way past the nested-node_modules pruning and stopped only at the .cache wall.
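
The ownership audit in the first bullet can be wrapped in a small helper. This is a sketch only: `list_owned_by_uid` and `count_owned_by_uid` are hypothetical names, and the demo runs against a temp directory rather than the production path. On the host, the PR's check corresponds to passing uid `0`.

```shell
# Hypothetical helpers: list or count every path under DIR owned by a numeric uid.
list_owned_by_uid() {
  local dir="$1" uid="$2"
  find "$dir" -uid "$uid" -print
}

count_owned_by_uid() {
  list_owned_by_uid "$1" "$2" | wc -l | tr -d ' '
}

# Self-contained demo: everything created here belongs to the calling user.
demo="$(mktemp -d)"
mkdir -p "$demo/node_modules/.cache/jiti"
touch "$demo/node_modules/.cache/jiti/example.mjs"
echo "paths owned by current user: $(count_owned_by_uid "$demo" "$(id -u)")"
rm -rf "$demo"
```

An empty listing for uid `0` outside `.cache` is what justifies a single exclude instead of a broader ownership fix.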

Production state (no regression)

The failed deploy aborted at the rsync, before pm2 restart and before deployment_applied=true, so it neither restarted PM2 nor rolled back. Production is healthy: post_list → HTTP 200, PM2 server online with ~16h uptime (= untouched by the failed deploy). This PR makes the next deploy complete through pm2 restart.
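
The ordering described here (sync first, restart and `deployment_applied=true` only afterwards) can be sketched as follows. This is an illustration under assumptions, not the workflow's actual code: `run_deploy`, `failing_sync`, `ok_sync`, and `fake_restart` are hypothetical stand-ins for the real rsync and pm2 steps.

```shell
# Sketch of the step ordering: a failed sync returns early, so the restart
# never runs and deployment_applied stays false.
run_deploy() {
  local sync_step="$1" restart_step="$2" rc=0
  deployment_applied=false

  "$sync_step" || rc=$?
  if [ "$rc" -ne 0 ]; then
    echo "sync failed with exit $rc; skipping restart"
    return "$rc"
  fi

  "$restart_step" || return
  deployment_applied=true
}

# Hypothetical stubs standing in for the rsync and pm2 restart steps.
failing_sync() { return 23; }
ok_sync()      { return 0; }
fake_restart() { echo "restarted"; }

run_deploy failing_sync fake_restart || true
echo "deployment_applied=$deployment_applied"
```

In this shape, exit 23 leaves the running process untouched, which matches the observation that PM2 kept its uptime.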

Notes

  • Code-only. Production is untouched by this change.
  • Latent follow-up (not required here): the root-owned .cache/jiti stays on disk. While prisma.config.ts is unchanged (hash 88e0c7e4), jiti only reads the world-readable cache — fine. If prisma.config.ts ever changes, jiti must write a new hash file into the root-owned dir as the non-root deploy user, which may fail. A one-time rm -rf node_modules/.cache on the host (as root) would clear that, but it is a production mutation and out of scope for this CI-only fix.
  • scripts/deploy (manual path) is unaffected — its rsyncs use neither --delete nor these excludes and never sync node_modules.
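
For the latent follow-up above, a preflight check could report whether the cache directory would accept a new file from the current user. This is a minimal sketch: `cache_write_status` is a hypothetical helper that the workflow does not contain, and the path shown in the usage comment is a placeholder.

```shell
# Hypothetical preflight: classify a cache directory from the caller's point of view.
cache_write_status() {
  local dir="$1"
  if [ ! -e "$dir" ]; then
    echo "missing"        # nothing on disk; the next writer would create it
  elif [ -d "$dir" ] && [ -w "$dir" ] && [ -x "$dir" ]; then
    echo "writable"       # a new content-hashed file can be added
  else
    echo "not-writable"   # e.g. a directory owned by another user with mode 0755
  fi
}

# Usage (placeholder path):
#   cache_write_status /path/to/app/node_modules/.cache/jiti
demo="$(mktemp -d)"
echo "temp dir: $(cache_write_status "$demo")"
echo "absent dir: $(cache_write_status "$demo/does-not-exist")"
rm -rf "$demo"
```

A `not-writable` result for the deploy user would signal that the one-time cleanup is worth scheduling before `prisma.config.ts` next changes.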

Risk

Low — CI-workflow change only.

Summary by CodeRabbit

  • Bug Fixes
    • Improved deployment reliability by resolving cache directory handling issues during rollback and synchronization operations.

The push-to-main deploy aborted at the artifact-sync rsync with exit 23
(`unlink(node_modules/.cache/jiti/nsx-prisma.config.*.mjs) failed:
Permission denied (13)` -> `cannot delete non-empty directory:
node_modules/.cache/jiti`), before the pm2 restart -- so the new bundle
landed on disk and `prisma migrate deploy` ran, but PM2 kept serving the
old code.

Root cause: `node_modules/.cache/jiti/` (jiti's compiled cache of
prisma.config.ts) is owned by root. A manual `pnpm deploy` (which SSHes
as root) ran `prisma migrate` and created it root-owned. The GitHub
Actions deploy rsync runs as a non-root user and cannot `--delete` a
root-owned file inside a root-owned 0755 dir -> exit 23.

Add anchored `--exclude='/node_modules/.cache/'` to all three deploy
rsyncs (artifact sync + both rollback rsyncs) so --delete never touches
the regenerable runtime cache regardless of its owner. Anchored with a
leading '/' so it matches only the top-level path, not deep node_modules
.cache dirs -- same fix shape as the prior /logs/ anchor fix (#3787).

Verified on the production host: `find node_modules -uid 0` returns only
.cache and its contents -- no other root-owned path -- so the exclude
fully unblocks the deploy. Code-only; production is untouched.

coderabbitai Bot commented May 31, 2026

Contributor

Walkthrough

The deployment workflow is updated to exclude node_modules/.cache/ from three separate rsync operations to prevent rsync exit 23 errors. This directory contains root-owned jiti/Prisma cache files that non-root rsync users cannot delete. The exclusion is applied in the rollback deployment, rollback artifact capture, and main staging-to-production sync steps, with updated comments.

Changes

rsync Cache Directory Exclusion

| Layer / File(s) | Summary |
| --- | --- |
| rsync excludes for node_modules cache in deploy/rollback paths (.github/workflows/deploy.yml) | The rsync exclude lists in the rollback deployment step, rollback artifact capture step, and main staging-to-production sync step are all extended with '/node_modules/.cache/' to preserve the cache directory and avoid exit 23 failures. Surrounding comments are updated to document that root-owned cache directories cannot be deleted by non-root rsync users. |

Estimated code review effort

🎯 1 (Trivial) | ⏱️ ~3 minutes

Possibly related PRs

  • laststance/nsx#3787: Both PRs modify the same GitHub deploy workflow's rsync exclude patterns (including anchored /-prefixed excludes and preserving cache-related directories) in the rollback capture and staging→production sync steps to address rsync exit-23 issues.
  • laststance/nsx#3777: Both PRs modify the same .github/workflows/deploy.yml deploy/rollback artifact tarball or sync behavior (extra rsync excludes vs changing archive creation with tar --hard-dereference), so they overlap in the workflow's artifact handling logic.
  • laststance/nsx#3776: Both PRs modify .github/workflows/deploy.yml to change how deployment/rollback artifacts are prepared and synchronized—main PR adds an rsync exclude to preserve node_modules/.cache, while retrieved PR refactors staging-directory cleanup/reset logic to avoid unsafe deletions before artifact sync.

Poem

🐰 A cache to keep, a sync to spare,
Three paths now exclude with utmost care,
No exit twenty-three shall block the way,
When root-owned files decide to stay!

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately identifies the main fix: adding rsync excludes for a root-owned jiti cache to prevent exit 23 errors during deployment. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

@coderabbitai coderabbitai Bot left a comment

🧹 Nitpick comments (1)
.github/workflows/deploy.yml (1)

347-349: Consider eliminating the root-owned cache to avoid relying on the exclude indefinitely.

The exclude correctly unblocks the deploy, but the underlying root-owned node_modules/.cache/ persists. Since deploy-time run_prisma_cli (Line 270) runs as the non-root Actions user against a root-owned 0755 directory, jiti can't write new cache entries if prisma.config.ts changes (it'll silently fall back to no-cache). A one-time sudo chown -R "$SSH_USERNAME" "$remote_dir/node_modules/.cache" on the host (or removing the directory so it regenerates under the correct owner) would restore writable caching and remove the latent ownership skew, while keeping the exclude as a safety net.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In @.github/workflows/deploy.yml around lines 347 - 349, Add a step that removes
or re-owns the root-owned cache on the remote host before syncing so the
non-root Actions user can write jiti/prisma cache: SSH to the remote and run
either a one-time sudo chown -R "$SSH_USERNAME"
"$remote_dir/node_modules/.cache" or rm -rf "$remote_dir/node_modules/.cache"
(so it regenerates owned by the Actions user) prior to the rsync/--delete step
(the line that currently calls rsync ... || fail_deployment). Keep the existing
--exclude='/node_modules/.cache/' as a safety net after fixing ownership.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: 5907c343-e6cb-4cc1-9d0b-61d9cdd45778

📥 Commits

Reviewing files that changed from the base of the PR and between f2b33bc and 97d1c69.

📒 Files selected for processing (1)
  • .github/workflows/deploy.yml

@ryota-murakami ryota-murakami merged commit d71a107 into main May 31, 2026
8 checks passed
@ryota-murakami ryota-murakami deleted the fix/deploy-rsync-skip-jiti-cache branch May 31, 2026 08:52