fix(jwt): harden provider JWT caching and refresh#3228
Conversation
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
📝 WalkthroughWalkthroughCentralizes provider auth on an ensureToken callback: JWT hooks add hydration, skew and error tracking with retries; provider-proxy APIs accept ensureToken and implement per-session WebSocket rotation with a RotationTracker; server signals token expiry; UI components gate access via useProviderAccess and render ProviderAuthFallback; call sites and tests updated. ChangesJWT Token Refresh and WebSocket Rotation
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes Possibly related PRs
Suggested labels
Suggested reviewers
✨ Finishing Touches🧪 Generate unit tests (beta)
Comment |
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## main #3228 +/- ##
==========================================
- Coverage 66.93% 65.72% -1.22%
==========================================
Files 1076 996 -80
Lines 26329 24403 -1926
Branches 6349 5964 -385
==========================================
- Hits 17623 16038 -1585
+ Misses 7610 7299 -311
+ Partials 1096 1066 -30
*This pull request uses carry forward flags. Click here to find out more.
🚀 New features to boost your workflow:
|
There was a problem hiding this comment.
Actionable comments posted: 3
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@apps/deploy-web/src/components/deployments/DeploymentLeaseShell.tsx`:
- Around line 40-43: The current hasShellAccess state is sticky so the
auth-failure banner gated by "!hasShellAccess" never reappears once shell access
was granted; change the banner rendering logic to check
providerCredentials.details.error (e.g., render when
Boolean(providerCredentials.details.error)) so the alert shows whenever the
provider credentials report an error even if hasShellAccess is true, and update
the second occurrence (the block referenced at lines 245-253) the same way; keep
the useEffect that sets hasShellAccess from providerCredentials.details.usable
unchanged.
In `@apps/deploy-web/src/components/deployments/DeploymentLogs.tsx`:
- Around line 60-63: The component DeploymentLogs sets hasLogsAccess to true but
never resets it, so auth-failure alerts are hidden after an initial success;
update the useEffect that currently reads providerCredentials.details.usable to
setHasLogsAccess(providerCredentials.details.usable) (i.e., assign the boolean
value rather than only setting true), or remove the derived state and read
providerCredentials.details.usable directly where the alert is rendered (also
apply the same fix for the similar block around the code handling lines 241-249)
so the UI reflects current token usability changes instead of sticking true
forever.
In `@apps/deploy-web/src/services/provider-proxy/provider-proxy.service.ts`:
- Around line 240-285: The generator currently exits silently when
input.ensureToken() rejects or when rotation is disallowed
(tracker.allowRotation() is false), leaving consumers without a terminal `{
closed: true }` message; update the loop in provider-proxy.service.ts so that if
await input.ensureToken() throws you catch the error, yield a terminal `{
closed: true }` message (or an equivalent LogStreamMessage with closed: true),
then rethrow or return to stop the generator, and likewise before any early
return where `!rotate || !tracker.allowRotation()` occurs yield the `{ closed:
true }` terminal message; use the existing symbols (input.ensureToken,
rotationAbort, session.disconnect, tracker.allowRotation,
WebsocketSession.receive) to locate the places to add the catch-and-yield and
the pre-return yield so consumers always get a final closed event.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 23894ffc-98c2-41f6-bd0c-ac310c806aee
📒 Files selected for processing (17)
apps/deploy-web/src/components/deployments/DeploymentLeaseShell.tsxapps/deploy-web/src/components/deployments/DeploymentLogs.tsxapps/deploy-web/src/components/deployments/LeaseRow.tsxapps/deploy-web/src/components/deployments/ManifestUpdate/ManifestUpdate.spec.tsxapps/deploy-web/src/components/deployments/ManifestUpdate/ManifestUpdate.tsxapps/deploy-web/src/components/new-deployment/CreateLease/CreateLease.tsxapps/deploy-web/src/hooks/useProviderApiActions.tsxapps/deploy-web/src/hooks/useProviderCredentials/useProviderCredentials.spec.tsapps/deploy-web/src/hooks/useProviderCredentials/useProviderCredentials.tsapps/deploy-web/src/hooks/useProviderJwt/useProviderJwt.spec.tsxapps/deploy-web/src/hooks/useProviderJwt/useProviderJwt.tsapps/deploy-web/src/queries/useLeaseQuery.spec.tsxapps/deploy-web/src/queries/useLeaseQuery.tsapps/deploy-web/src/services/provider-proxy/provider-proxy.service.spec.tsapps/deploy-web/src/services/provider-proxy/provider-proxy.service.tsapps/provider-proxy/src/services/WebsocketServer.tsapps/provider-proxy/test/functional/provider-proxy-ws.spec.ts
There was a problem hiding this comment.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
apps/deploy-web/src/services/provider-proxy/provider-proxy.service.ts (1)
50-61:⚠️ Potential issue | 🟠 Major | ⚡ Quick winRefresh the JWT inside the manifest retry loop.
ensureToken()is resolved once before the loop, so attempts 2/3 reuse the same JWT after the retry backoff. If that token expires between attempts, this path keeps resending stale auth instead of actually refreshing.Suggested fix
const serializedManifest = manifestToSortedJSON(manifest); - const credentials: ProviderCredentials = { type: "jwt", value: await options.ensureToken() }; let response: AxiosResponse | undefined; for (let i = 1; i <= 3 && !response; i++) { this.logger.info({ event: "ATTEMPT_SEND_MANIFEST", attempt: i, providerAddress: providerInfo.owner, dseq: options.dseq }); try { if (!response) { + const credentials: ProviderCredentials = { type: "jwt", value: await options.ensureToken() }; response = await this.request(`/deployment/${options.dseq}/manifest`, { method: "PUT", credentials, body: serializedManifest, timeout: 60_000,🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@apps/deploy-web/src/services/provider-proxy/provider-proxy.service.ts` around lines 50 - 61, The current code captures the JWT once (via options.ensureToken()) before the retry loop so subsequent attempts reuse a potentially-expired token; move the await options.ensureToken() call into the retry loop and build a fresh ProviderCredentials object (type "jwt", value: await options.ensureToken()) immediately before calling this.request(`/deployment/${options.dseq}/manifest`, ...) so each attempt uses a newly-fetched token; update references around manifestToSortedJSON, ProviderCredentials, response and the for-loop to ensure credentials is re-created per iteration.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@apps/deploy-web/src/components/deployments/ProviderAuthGate.tsx`:
- Around line 17-21: The hook currently short-circuits to return fallback: null
whenever hasAccess is true, which hides later errors in
providerCredentials.details.error; update the ProviderAuthGate hook so it always
supplies a fallback component when providerCredentials.details.error exists
(even if hasAccess is true) — keep the hasAccess boolean for gating behavior but
set fallback to <ProviderAuthFallback error={providerCredentials.details.error}
/> when providerCredentials.details.error is truthy; locate the hasAccess return
branch in ProviderAuthGate and change the fallback logic to prefer the
error-based ProviderAuthFallback over null, ensuring ProviderAuthFallback,
hasAccess, and providerCredentials.details.error are used as the deciding
symbols.
---
Outside diff comments:
In `@apps/deploy-web/src/services/provider-proxy/provider-proxy.service.ts`:
- Around line 50-61: The current code captures the JWT once (via
options.ensureToken()) before the retry loop so subsequent attempts reuse a
potentially-expired token; move the await options.ensureToken() call into the
retry loop and build a fresh ProviderCredentials object (type "jwt", value:
await options.ensureToken()) immediately before calling
this.request(`/deployment/${options.dseq}/manifest`, ...) so each attempt uses a
newly-fetched token; update references around manifestToSortedJSON,
ProviderCredentials, response and the for-loop to ensure credentials is
re-created per iteration.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 4393bd0f-9248-485c-9875-140eb8936af9
📒 Files selected for processing (9)
apps/deploy-web/src/components/deployments/DeploymentLeaseShell.tsxapps/deploy-web/src/components/deployments/DeploymentLogs.tsxapps/deploy-web/src/components/deployments/ProviderAuthGate.tsxapps/deploy-web/src/components/new-deployment/CreateLease/CreateLease.spec.tsxapps/deploy-web/src/hooks/useProviderCredentials/useProviderCredentials.tsapps/deploy-web/src/hooks/useProviderJwt/useProviderJwt.tsapps/deploy-web/src/services/provider-proxy/provider-proxy.service.spec.tsapps/deploy-web/src/services/provider-proxy/provider-proxy.service.tsapps/provider-proxy/test/functional/provider-proxy-ws.spec.ts
💤 Files with no reviewable changes (1)
- apps/deploy-web/src/hooks/useProviderJwt/useProviderJwt.ts
🚧 Files skipped from review as they are similar to previous changes (3)
- apps/provider-proxy/test/functional/provider-proxy-ws.spec.ts
- apps/deploy-web/src/hooks/useProviderCredentials/useProviderCredentials.ts
- apps/deploy-web/src/services/provider-proxy/provider-proxy.service.spec.ts
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@apps/deploy-web/src/services/provider-proxy/RotatingShellSession.ts`:
- Around line 37-39: openNextSession currently calls this.options.ensureToken()
before checking teardown state, which can trigger token work even after
cancellation/disconnect; change the logic in openNextSession to first check
this.isDisconnected and this.options.signal?.aborted and return early if set,
then call this.options.ensureToken(), so token generation is skipped for
torn-down sessions (refer to method openNextSession and the ensureToken call and
guards isDisconnected/options.signal?.aborted).
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 94ebbd20-e825-4740-830b-6094201c9fde
📒 Files selected for processing (5)
apps/deploy-web/src/components/deployments/ProviderAuthGate.tsxapps/deploy-web/src/services/provider-proxy/RotatingShellSession.tsapps/deploy-web/src/services/provider-proxy/provider-proxy.service.spec.tsapps/deploy-web/src/services/provider-proxy/provider-proxy.service.tsapps/deploy-web/src/services/provider-proxy/ws-rotation.ts
🚧 Files skipped from review as they are similar to previous changes (2)
- apps/deploy-web/src/components/deployments/ProviderAuthGate.tsx
- apps/deploy-web/src/services/provider-proxy/provider-proxy.service.ts
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@apps/deploy-web/src/hooks/useProviderAccess/useProviderAccess.spec.ts`:
- Around line 40-46: The test helper buildCredentials currently creates a nested
mock by passing details: mock<UseProviderCredentialsResult["details"]>({...})
which yields proxy-shaped nested objects; instead, instantiate a root mock of
UseProviderCredentialsResult (via mock<UseProviderCredentialsResult>()) and then
assign its details property to a plain object with usable and error fields (not
a mock) so assertions inspect a normal POJO; update buildCredentials to return
that root mock with details replaced by the plain object.
In `@apps/deploy-web/src/hooks/useProviderAccess/useProviderAccess.ts`:
- Around line 6-10: The hook initializes hasAccess to false causing a wrong
first render; change the useState initialization in useProviderAccess (the
hasAccess/setHasAccess state) to seed from the current
providerCredentials.details.usable value instead of always false so initial
render reflects current credentials; keep the existing useEffect that updates
hasAccess on providerCredentials.details.usable changes.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 39867adb-8061-4270-9b61-f7193b0f299a
📒 Files selected for processing (6)
apps/deploy-web/src/components/deployments/DeploymentLeaseShell.tsxapps/deploy-web/src/components/deployments/DeploymentLogs.tsxapps/deploy-web/src/components/deployments/ProviderAuthGate.spec.tsxapps/deploy-web/src/components/deployments/ProviderAuthGate.tsxapps/deploy-web/src/hooks/useProviderAccess/useProviderAccess.spec.tsapps/deploy-web/src/hooks/useProviderAccess/useProviderAccess.ts
💤 Files with no reviewable changes (1)
- apps/deploy-web/src/components/deployments/ProviderAuthGate.tsx
🚧 Files skipped from review as they are similar to previous changes (2)
- apps/deploy-web/src/components/deployments/DeploymentLogs.tsx
- apps/deploy-web/src/components/deployments/DeploymentLeaseShell.tsx
- Hydrate JWT atom from storage before auto-gen fires, so a hard refresh
with a valid cached token no longer triggers a fresh POST.
- Make `isTokenExpired` skew-aware so we refresh 60s before exp.
- Wrap `ensureToken` in cockatiel retry, surface a toast via notificator
and populate `details.error` once the retries are exhausted.
- Dedup in-flight token generation at module scope so multiple hook
instances share one POST instead of racing on mount.
- Emit a structured `{ error: "tokenExpired" }` payload from the proxy's
schema-validation path so the client matches a stable contract.
- Add a WS rotation loop in `provider-proxy.service` that swaps the
underlying session on `tokenExpired`, capped at 3 rotations in 60s,
emitting a synthetic close when the cap or auth-gen fails so the UI
surfaces the existing "unavailable" panel.
- Render a skeleton while credentials load and a warning alert when
auth-gen ultimately fails, so the shell/logs panes no longer stay
blank for ~3.5s.
- Refresh the token per-attempt inside `sendManifest`'s internal retry
loop and inside `useLeaseStatus`, since a long-running flow would
otherwise stale-token-401 the provider.
- Extend the managed JWT scope to include `send-manifest` /
`get-manifest` (previously missing, blocking PUT /manifest with 401).
- Extract useProviderAuthGate hook to dedupe sticky-access state and skeleton/Alert fallback shared by DeploymentLeaseShell and DeploymentLogs - Drop dead credentials? field from SendManifestToProviderOptions and hoist ensureToken() out of the retry loop - Cap RotatingShellSession outbound queue, attach .catch() to sessionReady to suppress unhandled rejections - Always disconnect WebsocketSession in finally so consumer break-out does not leak the underlying socket - Remove redundant setIsHydrated(false) toggle and stale narrating comment - Fix CreateLease.spec.tsx assertions that were left referencing the old credentials shape - Use Buffer "base64url" encoding in provider-proxy-ws test helper
Extract RotatingShellSession and its supporting ws-rotation primitives (RotationTracker, mergeSignals, isTokenExpiredMessage) into sibling files. provider-proxy.service.ts re-exports ReceivedShellMessage so existing consumer import paths are unchanged.
…xhaustion
- ProviderAuthGate now renders the warning alert whenever the credentials
error is set, even after hasAccess has become true, so token-refresh
exhaustion mid-session is visible.
- getLogsStream yields { closed: true } when ensureToken rejects or when
the rotation cap is hit, so consumers see a terminal state instead of
silently stalling.
Replace useProviderAuthGate (hook returning JSX) with useProviderAccess (sticky-access boolean) and ProviderAuthFallback (presentational component). Cleaner separation between state and rendering.
Guard openNextSession against running token generation work after the session was disconnected or its abort signal fired before the call.
Extracts the sticky provider-access hook out of ProviderAuthGate.tsx into hooks/useProviderAccess and covers it with renderHook tests. Adds tests for ProviderAuthFallback covering the no-fallback, skeleton, error and sticky-then-error branches.
Initialize the sticky access state from providerCredentials.details.usable so the first render reports the correct value when credentials are already usable. Update the spec helper to build credentials.details as a plain object instead of a nested mock proxy.
dcf3dbd to
13b9c68
Compare
Make ensureToken a ref-backed stable callback so long-lived consumers (RotatingShellSession, getLogsStream) don't capture stale token state and DeploymentLeaseShell can include it in deps without the shell session tearing down on JWT rotation.
Replaces the module-level mutable inFlightTracker with a jotai atom read via getDefaultStore(), matching the JWT_TOKEN_ATOM pattern in useProviderJwt. Same dedup semantics — concurrent ensureToken calls from multiple hook instances share one in-flight generateToken promise.
Follow the src/store/*Store.ts pattern (camelCase atoms grouped in a default-exported namespace, e.g. walletStore, sdlStore) instead of declaring the atom inline in the hook.
…atom Match the convention used by walletStore/sdlStore consumers.
The hook now uses useSetAtom on dedicated write atoms (claim/release/ clearForOtherAddress) that perform atomic check-and-set against the in-flight atom via jotai's get/set callbacks — no more raw imperative store access from inside the hook.
Exercise claim/release/clear-for-other-address via a fresh jotai store per test, asserting state transitions and dedup semantics directly.
The CA-authorized sync branch in createProviderSocket emitted "verified"
inside the "upgrade" handler, while the ws was still in CONNECTING. Any
once("verified", ...) listener that called ws.send then threw
"WebSocket is not open: readyState 0 (CONNECTING)". The new
RotatingShellSession + WS-rotation flow on this branch opens provider
sockets more often and made the race observable. The async cert-
validation branch already checks readyState before emitting; the sync
branch now goes through the same guard via emitVerifiedWhenOpen.
useProviderList refetches on window focus by default, returning a new providers array with new object identities. The shellSession useMemo was depending on the whole providerInfo/selectedLease objects, so the benign refetch tore down and rebuilt the shell session and re-sent the init Uint8Array, reprinting the "/app $" prompt. Now depending on the primitive fields actually consumed by connectToShell.
DeploymentLogs had two sources of identity churn that restarted the log stream on every tab refocus: - selectedLease (object) in the stream useEffect deps - selectedServices replaced with a new array on every leaseStatus refetch, even when the keys were identical Switching the deps to primitive ids (dseq/gseq/oseq, providerHostUri, providerAddress) and short-circuiting setSelectedServices when the new key list deep-equals the previous one keeps the stream attached.
Replace useState+useEffect sticky latch in useProviderAccess with a render-time useRef mutation, avoiding the extra render on the unusable→usable flip. In RotationTracker.allowRotation, check the cap before pushing so refused rotations don't leave a stale timestamp in the window buffer.
Cover emitVerifiedWhenOpen (both OPEN and CONNECTING branches) and isJwtExpiredError (match + non-match paths) directly, lifting codecov patch coverage above the 80% threshold.
Why
Fixes CON-371. PR #3217 (CON-268) switched managed
deploy-webto a JWT-only provider auth flow with a cache-firstensureToken()(jotai atom + localStorage, 30 min TTL). The cache works on the happy path but several edge cases reduced its value or stranded users:expdied almost immediately.POST /v1/create-jwt-token.send-manifest/get-manifest, so PUT /manifest returned 401 in production paths after CON-268 — caught during smoke and bundled in.sendManifest's internal retry loop snapshotted the token at call time; long flows (lease-not-found backoff) could outlive the cached token.What
Client (
apps/deploy-web):useProviderJwt: hydrated flag + skew-awareisTokenExpired(REFRESH_SKEW_SECONDS = 60), JWT scope extended to includesend-manifest/get-manifest.useProviderCredentials: cockatiel retry insideensureToken, toast on exhaustion viauseNotificator,details.errorexposed, in-flight tracker promoted to module scope so all hook instances dedup against the same POST. Cleared onaddresschange.provider-proxy.service.ts: WS rotation loop ingetLogsStreamand a newRotatingShellSessionthat swap the underlying WebsocketSession ontokenExpired, capped at 3 rotations within 60s. Per-session inner AbortController so cockatiel doesn't retry on intentional rotation disconnects. Emits a syntheticclosed: truewhen the cap is exceeded or auth-gen fails, so the existing "unavailable" panel surfaces.sendManifestcallsensureTokenper attempt inside its internal retry loop.DeploymentLeaseShell/DeploymentLogs: stickyhasShellAccess/hasLogsAccessstate so the panes survive transientusableflips during refresh; skeleton-loading placeholder while credentials load; warning alert when auth-gen ultimately fails.useLeaseStatus,ManifestUpdate,LeaseRow,CreateLease,useProviderApiActions: switched from snapshotproviderCredentials.detailstoensureTokenflow where the request is auth-bearing.Server (
apps/provider-proxy):WebsocketServer: when the schema-validation failure is specifically "Token has expired", emit{ type: "websocket", error: "tokenExpired" }(stable contract) instead of the genericInvalid message formatshape. Other schema failures unchanged.Specs:
useProviderJwt.spec.tsx,useProviderCredentials.spec.ts,provider-proxy.service.spec.ts,useLeaseQuery.spec.tsx,ManifestUpdate.spec.tsxupdated for the new contract and new behaviors. Added functional case inapps/provider-proxy/test/functional/provider-proxy-ws.spec.tsasserting the structuredtokenExpiredpayload.Verification
npm run lint -- --quietclean inapps/deploy-webandapps/provider-proxy.npx tsc --noEmitdelta is 0 (17 pre-existingManifestUpdate.spec.tsxerrors unrelated to this PR).npm run test:unitgreen in both apps.POST /v1/create-jwt-tokenon hard refresh while the cached token is in-window.{ error: "tokenExpired" }frame, a new WS handshake fires with a freshAuthorization: Bearer …, the XTerm stays mounted (no remount on refresh)./v1/create-jwt-tokenblocked → after the auth-gen retries exhaust, the synthetic close yields the "Shell access unavailable" alert./v1/create-jwt-tokenblocked → 4 attempts (1 + 3 retries, ~3.5s), error toast appears, no infinite retry; skeleton shows during the retry window so the pane isn't blank.Summary by CodeRabbit
New Features
Bug Fixes