fix: harden OpenClaw Podman runtime flow by shadowfax92 · Pull Request #770 · browseros-ai/BrowserOS

Nikhil (shadowfax92) · 2026-04-20T22:45:56Z

Summary

give BrowserOS a dedicated named Podman machine and dynamic host-port handling for the OpenClaw gateway
persist small runtime state, make OpenClawService runtime-aware, and add repair/reset recovery flows with restart auto-escalation on strong machine failures
simplify the Agents page into one operator/runtime card and expose repair/reset controls through the API and client hooks

Design

This change keeps BrowserOS in control of a dedicated OpenClaw runtime without exposing container or VM concepts in the UI. The server now owns named-machine lifecycle, discovers and retries host-port binding safely, persists the chosen host port plus repair metadata, and uses runtime inspection/status to drive service recovery. The agent UI collapses raw runtime/control-plane details into a simpler operator state with explicit repair/reset actions while preserving setup, agent management, and Podman override flows.
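The collapse from fine-grained server status to a coarse operator state can be pictured with a small mapping. This is a minimal sketch and not the PR's `getOpenClawOperatorState`: the union members are taken from values mentioned in this thread, while the function name `toOperatorState` and the exact mapping are assumptions for illustration.

```typescript
// Hypothetical unions; the real types may carry more members.
type ServiceStatus = 'uninitialized' | 'starting' | 'running' | 'error'
type OperatorState = 'setup-needed' | 'starting' | 'healthy' | 'needs-attention'

// Assumed mapping: one coarse UI state per server status.
function toOperatorState(status: ServiceStatus | undefined): OperatorState {
  switch (status) {
    case undefined: // status not loaded yet, treated like a missing setup
    case 'uninitialized':
      return 'setup-needed'
    case 'starting':
      return 'starting'
    case 'running':
      return 'healthy'
    case 'error':
      return 'needs-attention'
  }
}
```

Keeping the mapping in one pure function is what makes it cheap to unit-test separately from the page component.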

Test plan

bun test apps/server/tests/api/services/openclaw/openclaw-runtime-state.test.ts apps/server/tests/api/services/openclaw/podman-runtime.test.ts apps/server/tests/api/services/openclaw/container-runtime.test.ts apps/server/tests/api/services/openclaw/openclaw-service.test.ts apps/server/tests/api/routes/openclaw.test.ts
bun test apps/agent/entrypoints/app/agents/openclaw-operator-state.test.ts
bun run --filter @browseros/server typecheck
bun run --filter @browseros/agent typecheck

Nikhil (shadowfax92) · 2026-04-20T22:46:07Z

Claude (@claude) review

Nikhil (shadowfax92) · 2026-04-20T22:46:07Z

Greptile (@greptileai) review

claude · 2026-04-20T22:46:27Z

Claude finished Nikhil (@shadowfax92)'s task in 2m 57s

Code Review: fix/harden OpenClaw Podman runtime flow

Gather context and understand the PR
Read core service changes (openclaw-service.ts, container-runtime.ts, podman-runtime.ts)
Read new state/runtime files (openclaw-runtime-state.ts, openclaw-operator-state.ts)
Read API route changes (openclaw.ts)
Read UI changes (AgentsPage.tsx, useOpenClaw.ts)
Read test files
Compile and post review

Overall this is a solid, well-structured hardening pass. The separation between the new openclaw-runtime-state.ts (Zod-validated persistence) and the operator-state mapping, the retry logic for port binding conflicts, and the machine-name isolation are all good design decisions. The tests are thorough. A few issues worth addressing before merge:

Bug: `hasFailureState()` includes `'recovering'`, shadowing `resolveStatus()`

openclaw-service.ts:1128-1135 — 'recovering' is included in hasFailureState():

private hasFailureState(): boolean {
  return (
    this.controlPlaneStatus === 'failed' ||
    this.controlPlaneStatus === 'recovering' ||  // ← this is the problem
    ...
  )
}

resolveStatus() checks hasFailureState() first (line 1087), so any call to getStatus() during a repair will return status: 'error' instead of status: 'starting'. But lower in resolveStatus() (lines 1099–1106), 'recovering' is explicitly handled as 'starting' — code that can never be reached:

if (
  this.controlPlaneStatus === 'connecting' ||
  this.controlPlaneStatus === 'reconnecting' ||
  this.controlPlaneStatus === 'recovering'   // ← dead code
) {
  return 'starting'
}

As a result, getOpenClawOperatorState() will return 'needs-attention' during repair instead of 'starting', and the copy in getOperatorCardCopy() that reads status?.controlPlaneStatus === 'recovering' (line ~171 of AgentsPage.tsx) will never render. Remove 'recovering' from hasFailureState().
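To make the precedence problem concrete, here is a reduced sketch of the two functions with `'recovering'` left out of the failure check. The field set is deliberately smaller than the real service, and the `'stopped'` status and the free-function shape are assumptions made for illustration.

```typescript
// Hypothetical, simplified model; the real OpenClawService tracks more fields.
type ControlPlaneStatus =
  | 'connected'
  | 'connecting'
  | 'reconnecting'
  | 'recovering'
  | 'disconnected'
  | 'failed'

// 'stopped' is an assumed status name used only in this sketch.
type ServiceStatus = 'running' | 'starting' | 'stopped' | 'error'

interface StatusInputs {
  controlPlaneStatus: ControlPlaneStatus
  lastError: string | null
}

// 'recovering' is deliberately absent: a repair in progress is not a failure.
function hasFailureState(s: StatusInputs): boolean {
  return s.controlPlaneStatus === 'failed' || s.lastError !== null
}

function resolveStatus(s: StatusInputs): ServiceStatus {
  // The failure check runs first, so anything it matches never reaches the branches below.
  if (hasFailureState(s)) return 'error'
  if (
    s.controlPlaneStatus === 'connecting' ||
    s.controlPlaneStatus === 'reconnecting' ||
    s.controlPlaneStatus === 'recovering'
  ) {
    return 'starting'
  }
  return s.controlPlaneStatus === 'connected' ? 'running' : 'stopped'
}
```

With this ordering, `resolveStatus({ controlPlaneStatus: 'recovering', lastError: null })` takes the `'starting'` branch. Note that a stale `lastError` left over from the failure that triggered the repair would still force `'error'`, so the repair path may also need to clear it.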

Smell: `inspectGateway()` duck-types its own `runtime` field

openclaw-service.ts:1032-1048:

private async inspectGateway(): Promise<GatewayInspection | null> {
  const runtime = this.runtime as Partial<ContainerRuntime> & {
    inspectGateway?: () => Promise<GatewayInspection>
  }
  if (typeof runtime.inspectGateway !== 'function') {
    return null
  }
  ...
}

ContainerRuntime already has inspectGateway() as a public method (line 169 of container-runtime.ts), so the widening cast and the typeof guard are unnecessary in production. They exist only to allow test stubs that omit inspectGateway. A cleaner solution is to either call this.runtime.inspectGateway() directly and wrap it in try/catch, or define an interface with an optional inspectGateway. The current approach is fragile: a test stub that accidentally provides a non-function inspectGateway property would silently return null rather than throwing.
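The second option (an interface with an optional method) might look like the following. This is a sketch under assumptions: the `GatewayRuntime` interface, the `GatewayInspection` fields, and the helper name `inspectGatewayOrNull` are hypothetical and not taken from the PR.

```typescript
// Hypothetical shapes for illustration; field names are assumptions.
interface GatewayInspection {
  running: boolean
  hostPort: number | null
}

interface GatewayRuntime {
  stopGateway(): Promise<void>
  // Optional on the interface, so test stubs may omit it without any cast.
  inspectGateway?(): Promise<GatewayInspection>
}

async function inspectGatewayOrNull(
  runtime: GatewayRuntime,
): Promise<GatewayInspection | null> {
  // The compiler now rejects a non-function inspectGateway property,
  // so the only case left to handle at runtime is "method absent".
  if (!runtime.inspectGateway) return null
  return runtime.inspectGateway()
}
```

The difference from the cast-based version is that a malformed stub fails type checking instead of silently producing `null`.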

Design: `PodmanOverridesCard` renders twice when operator needs attention

AgentsPage.tsx:733-738 and :809 — When operatorState is 'setup-needed' or 'needs-attention', the inline PodmanOverridesCard appears inside the operator card AND the standalone collapsible card is always rendered at the bottom. Both are live and writable at the same time. The standalone version starts collapsed, but there are now two independent inputs for the same server-side value — confusing if a user types in one and saves/clears via the other.

Consider only showing the inline variant when the operator is in a degraded/uninitialized state, and hiding the standalone card entirely in those states.

Unused export: `reconnectOpenClaw` / `reconnecting` in `useOpenClawMutations`

useOpenClaw.ts:235-242 and :268-269 — reconnectOpenClaw and reconnecting are returned from the hook but do not appear anywhere in AgentsPage.tsx. If the reconnect action was intentionally removed from the UI, clean up the returned values. If it's kept for future use, add a comment. Dead exports degrade discoverability and inflate the hook's surface area.

Minor: sync fs calls in async route validation

openclaw.ts:51-62 — existsSync, statSync, and accessSync are synchronous in the getPodmanOverrideValidationError validator, which is called from an async Hono handler. They briefly block the event loop. In a local server with low concurrency this is unlikely to matter, but it's inconsistent with the async patterns used elsewhere in the file. Consider fs.promises.access + fs.promises.stat.
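A non-blocking version of the check could be built on `stat` and `access` from `node:fs/promises`. The function name, the choice of `X_OK`, and the error messages below are assumptions; the actual rules in `getPodmanOverrideValidationError` are not shown in this thread.

```typescript
import { constants } from 'node:fs'
import { access, stat } from 'node:fs/promises'

// Hypothetical async validator: returns an error message, or null when the path is usable.
async function validateExecutablePath(path: string): Promise<string | null> {
  try {
    const info = await stat(path)
    if (!info.isFile()) return 'Path is not a file'
    // X_OK checks execute permission (on Windows it behaves like an existence check).
    await access(path, constants.X_OK)
    return null
  } catch (error) {
    const code = (error as NodeJS.ErrnoException).code
    if (code === 'ENOENT') return 'Path does not exist'
    if (code === 'EACCES') return 'Path is not executable'
    return 'Path could not be checked'
  }
}
```

A route handler would then `await validateExecutablePath(body.podmanPath)` (field name assumed) before persisting the override.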

Minor: `resetRuntime()` applies `classifyControlPlaneError` to non-gateway errors

openclaw-service.ts:455-461 — When stopGateway() or stopMachineIfSafe() throws inside resetRuntime(), classifyControlPlaneError is called and lastRecoveryReason is set to something like 'unknown'. But lastRecoveryReason semantically represents gateway/control-plane connection errors, not machine lifecycle errors. Consider leaving lastRecoveryReason as-is (or null) on reset failure and just setting lastError.

Minor: Redundant container removal on last `startGateway` attempt

container-runtime.ts:140-148 — On the final retry of a bind-conflict, removeGatewayContainer() is called and then the loop throws. The removal before the throw is harmless but unnecessary (the container failed to start, so there's nothing to remove). Worth cleaning up to remove the ambiguity about whether the cleanup is meaningful.
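One way to express the loop so that cleanup happens only when another attempt will follow is sketched here. The injected `StartDeps` functions are hypothetical stand-ins for the Podman calls, and the limit of 3 attempts is an assumption. Whether skipping the final removal is actually safe depends on whether a failed `podman run` leaves a created container behind, which this sketch does not establish.

```typescript
// Assumed attempt limit for this sketch.
const GATEWAY_START_MAX_ATTEMPTS = 3

// Hypothetical dependencies, injected so the loop can be exercised without Podman.
interface StartDeps {
  choosePort(attempt: number): Promise<number>
  runContainer(port: number): Promise<{ ok: boolean; bindConflict: boolean; stderr: string }>
  removeContainer(): Promise<void>
}

async function startGatewayWithRetry(deps: StartDeps): Promise<number> {
  for (let attempt = 1; attempt <= GATEWAY_START_MAX_ATTEMPTS; attempt++) {
    const port = await deps.choosePort(attempt)
    const result = await deps.runContainer(port)
    if (result.ok) return port

    const canRetry = result.bindConflict && attempt < GATEWAY_START_MAX_ATTEMPTS
    if (!canRetry) {
      throw new Error(`gateway failed to start on port ${port}: ${result.stderr}`)
    }
    // Single retry branch: remove the leftover container, then loop to pick a new port.
    await deps.removeContainer()
  }
  throw new Error('unreachable: the loop either returns or throws')
}
```

Folding the decision into one `canRetry` flag also removes the three separate conditional blocks, so the live `continue` path is no longer easy to mistake for dead code.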

Nit: Dialog's `newName` is not cleared on cancel

AgentsPage.tsx:495-514 — setNewName('') is only called after a successful createAgent call. If the dialog is dismissed via the X or onOpenChange, the typed name persists on next open (the useEffect that sets 'agent' only fires if the current value is falsy). This may be intentional (preserves partial input), but if so it should at least be consistent: the setupProviderId and createProviderId don't reset either. Consider aligning the UX either way.

Positive callouts

Zod schema in openclaw-runtime-state.ts with safeParse + strict() is clean and future-proof.
chooseGatewayHostPort() with allocateDistinctEphemeralPort() handles port conflicts gracefully. The Set<number> tracking across attempts avoids allocating the same ephemeral port twice.
stopMachineIfSafe() is thoughtful — checking for non-BrowserOS containers before stopping the machine avoids disrupting user workloads.
Operator state abstraction (getOpenClawOperatorState) cleanly decouples the server's fine-grained status model from the UI's coarser needs-attention / healthy / starting states. The test coverage of edge cases is solid.
isStrongMachineCorruptionSignature() auto-escalation from restart → repair is a good reliability improvement.
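The `Set<number>` tracking mentioned above can be sketched as follows. This is not the PR's implementation: the function bodies, the injectable `allocate` parameter, and the `maxTries` limit are assumptions, and only the technique (bind to port 0, then reject ports already tried) follows the description.

```typescript
import { createServer } from 'node:net'

// Ask the OS for a free loopback port by binding to port 0, then release it.
function allocateEphemeralPort(): Promise<number> {
  return new Promise((resolve, reject) => {
    const server = createServer()
    server.once('error', reject)
    server.listen(0, '127.0.0.1', () => {
      const address = server.address()
      if (address === null || typeof address === 'string') {
        server.close(() => reject(new Error('unexpected address type')))
        return
      }
      const { port } = address
      server.close(() => resolve(port))
    })
  })
}

// Returns a port not present in `tried` and records it there for later attempts.
async function allocateDistinctEphemeralPort(
  tried: Set<number>,
  allocate: () => Promise<number> = allocateEphemeralPort,
  maxTries = 10,
): Promise<number> {
  for (let i = 0; i < maxTries; i++) {
    const port = await allocate()
    if (!tried.has(port)) {
      tried.add(port)
      return port
    }
  }
  throw new Error('could not find an untried ephemeral port')
}
```

The port is released before the container binds it, so another process can still take it in between; the retry on bind conflict is what covers that window.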

github-actions · 2026-04-20T22:47:54Z

❌ Tests failed — 4/830 failed

| Suite | Passed | Failed | Skipped |
| --- | --- | --- | --- |
| ✅ `agent-sdk` | 44/44 | 0 | 0 |
| ✅ `agent` | 25/25 | 0 | 0 |
| ✅ `build` | 7/7 | 0 | 0 |
| ✅ `eval` | 8/8 | 0 | 0 |
| ✅ `server-agent` | 261/261 | 0 | 0 |
| ❌ `server-api` | 160/162 | 2 | 0 |
| ✅ `server-browser` | 3/3 | 0 | 0 |
| ✅ `server-integration` | 9/9 | 0 | 0 |
| ✅ `server-root` | 35/38 | 0 | 3 |
| ✅ `server-sdk` | 11/11 | 0 | 0 |
| ✅ `server-skills` | 31/31 | 0 | 0 |
| ❌ `server-tools` | 229/231 | 2 | 0 |

Failed tests

server-api — ContainerRuntime > startGateway retries with a different host port when podman reports a bind conflict
server-api — ContainerRuntime > startGateway cleans up the managed container after an exhausted bind-conflict retry sequence
server-tools — get_dom > scopes to a nested CSS selector
server-tools — search_dom > returns element attributes in search results

greptile-apps · 2026-04-20T22:50:51Z

Greptile Summary

This PR hardens the OpenClaw Podman runtime by introducing a dedicated named machine (browseros-openclaw), dynamic host-port selection with retry, persisted runtime state (runtime-state.json), and explicit repair/reset recovery flows with auto-escalation on machine corruption. The agent UI is simplified into a single operator card with repair/reset controls surfaced through new API routes and client hooks.

Two P1 issues need attention before merging:

hasFailureState() includes controlPlaneStatus === 'recovering', so resolveStatus() always returns 'error' while repair is in progress instead of 'starting' — the client shows an error card and allows re-triggering repair during an active repair cycle.
resetRuntime() calls stopGateway() without a try/catch guard; when the Podman machine is already down (the exact scenario that triggers reset), the stop fails and the reset never clears state, leaving the service permanently stuck.

Confidence Score: 3/5

Two P1 bugs in the recovery path should be fixed before merging.

The two P1 issues both affect the exact failure scenarios this PR is designed to harden: hasFailureState masks repair progress as an error state, and resetRuntime silently fails when the Podman machine is already down — the condition that most warrants a reset. The rest of the implementation (named machine, dynamic port, state persistence, UI simplification) is solid.

openclaw-service.ts — both P1 bugs live here (hasFailureState and resetRuntime).

Important Files Changed

| Filename | Overview |
| --- | --- |
| packages/browseros-agent/apps/server/src/api/services/openclaw/openclaw-service.ts | Core orchestrator: adds repair/reset/runtime-state persistence; has two P1 bugs: `hasFailureState` masks repair-in-progress as error, and `resetRuntime` doesn't guard `stopGateway` failures. |
| packages/browseros-agent/apps/server/src/api/services/openclaw/container-runtime.ts | New dynamic host-port selection and gateway inspection; retry logic is functionally correct but has redundant conditional structure in the bind-conflict loop. |
| packages/browseros-agent/apps/server/src/api/services/openclaw/podman-runtime.ts | Adds named Podman machine (`browseros-openclaw`) and machine lifecycle helpers; clean implementation with proper Linux no-op guards. |
| packages/browseros-agent/apps/server/src/api/services/openclaw/openclaw-runtime-state.ts | New file: persists runtime state (host port, repair generation, last repair outcome) with Zod schema validation; straightforward and well-guarded. |
| packages/browseros-agent/apps/server/src/api/routes/openclaw.ts | Adds `/repair` and `/reset` endpoints delegating to service; consistent error handling pattern throughout. |
| packages/browseros-agent/apps/agent/entrypoints/app/agents/openclaw-operator-state.ts | Derives UI operator state from status response; maps `recovering` to `starting` correctly, but the P1 in `hasFailureState` means `status.status` is `'error'` during repair, so this mapping is never triggered. |
| packages/browseros-agent/apps/agent/entrypoints/app/agents/useOpenClaw.ts | Adds `repairMutation` and `resetMutation` client hooks; correct query invalidation and pending state tracking. |
| packages/browseros-agent/apps/agent/entrypoints/app/agents/AgentsPage.tsx | Consolidates runtime card UI with repair/reset controls and reset confirmation dialog; straightforward React refactor. |

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[tryAutoStart / setup / start / restart] --> B{Podman available?}
    B -- No --> Z[return uninitialized]
    B -- Yes --> C[PodmanRuntime.ensureReady]
    C --> D{machine initialized?}
    D -- No --> E[initMachine]
    E --> F[startMachine]
    D -- Yes --> G{machine running?}
    G -- No --> F
    G -- Yes --> H[ContainerRuntime.startGateway]
    F --> H
    H --> I{preferred port free?}
    I -- Yes --> J[podman run -p preferredPort]
    I -- No --> K[allocateDistinctEphemeralPort]
    K --> J
    J --> L{exit 0?}
    L -- No, bind conflict < 3 attempts --> K
    L -- No, other error --> ERR[throw]
    L -- Yes --> M[applyGatewayPort / waitForReady]
    M --> N{ready?}
    N -- No --> ERR
    N -- Yes --> O[runControlPlaneCall probe]
    O --> P[recordSuccessfulGatewayStart - save runtime-state.json]
    P --> Q[status: running / connected]
    Q -- restart error + machine corruption --> R[repairRuntime]
    R --> R1[stopGateway best-effort]
    R1 --> R2[stopMachineIfSafe best-effort]
    R2 --> R3[ensureReady]
    R3 --> R4[startGateway]
    R4 --> R5[save repairGeneration++]
    Q -- UI reset --> S[resetRuntime]
    S --> S1[stopGateway NO try/catch]
    S1 -- throws if machine down --> ERR2[reset fails - state unchanged]
    S1 -- ok --> S2[clearState + save defaultRuntimeState]

Comments Outside Diff (1)

packages/browseros-agent/apps/server/src/api/services/openclaw/openclaw-service.ts, lines 219-238

allowedOrigins config may reference a stale port after setup

In setup(), applyBrowserosConfig() writes gateway.controlUi.allowedOrigins using this.port before launchGatewayRuntime has run. If startGateway selects a new ephemeral port (bind conflict on the preferred port), this.port is updated only after startup succeeds, so the allowed-origins list baked into openclaw.json will point to the old port, not the actual bound port. The mismatch won't affect API calls from BrowserOS, but it will break any browser request to the control UI origin (CORS rejection).
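One possible remedy is to rewrite the origins once the bound port is known. This is a sketch only: the `ControlUiConfig` shape and the helper `rebindAllowedOrigins` are hypothetical, and the real config may be better served by simply writing `openclaw.json` after `startGateway` has chosen its port.

```typescript
// Hypothetical slice of the gateway config; only the field discussed above is modeled.
interface ControlUiConfig {
  allowedOrigins: string[]
}

// Replace the stale port with the bound port in every origin that used it.
function rebindAllowedOrigins(
  config: ControlUiConfig,
  stalePort: number,
  boundPort: number,
): ControlUiConfig {
  const allowedOrigins = config.allowedOrigins.map((origin) => {
    const url = new URL(origin)
    if (url.port !== String(stalePort)) return origin // unrelated origin, keep as-is
    url.port = String(boundPort)
    return url.origin
  })
  return { ...config, allowedOrigins }
}
```

For example, with a stale port of 4000 and a bound port of 4100, `http://127.0.0.1:4000` becomes `http://127.0.0.1:4100` while origins on other ports are left untouched.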

Prompt To Fix All With AI

This is a comment left during a code review.
Path: packages/browseros-agent/apps/server/src/api/services/openclaw/openclaw-service.ts
Line: 1128-1136

Comment:
**`hasFailureState()` treats `recovering` as an error, masking repair progress**

`hasFailureState()` returns `true` when `controlPlaneStatus === 'recovering'`, so `resolveStatus()` always returns `'error'` while `repairRuntime` is active (which sets the status to `'recovering'`). The intended `'starting'` branch in `resolveStatus` is never reached for `'recovering'` because `hasFailureState` short-circuits it. On the client side, `getOpenClawOperatorState` maps `status === 'error'` to `'needs-attention'`, so the UI shows an error card with repair/reset controls during an ongoing repair — the user can re-trigger repair while one is already running.

```suggestion
  private hasFailureState(): boolean {
    return (
      this.controlPlaneStatus === 'failed' ||
      this.lastGatewayError !== null ||
      this.lastError !== null ||
      this.lastRecoveryReason !== null
    )
  }
```

How can I resolve this? If you propose a fix, please make it concise.

---

This is a comment left during a code review.
Path: packages/browseros-agent/apps/server/src/api/services/openclaw/openclaw-service.ts
Line: 442-462

Comment:
**`resetRuntime` lets `stopGateway` throw, blocking reset when the gateway is already dead**

Unlike `repairRuntime`, which wraps both `stopGateway` and `stopMachineIfSafe` in their own `try/catch` (best-effort), `resetRuntime` calls `stopGateway()` bare. If the Podman machine is down and the Podman CLI itself fails (exits non-zero for a reason other than a missing container), `stopGateway` throws and `resetRuntime` re-throws without ever clearing `controlPlaneStatus`, `lastError`, or the persisted runtime state. A user hitting "Reset" as a last resort after a machine failure will find that reset itself fails for exactly the same reason.

```typescript
// Suggested fix: mirror the pattern from repairRuntime
async resetRuntime(): Promise<void> {
  try {
    this.stopGatewayLogTail()
    try {
      await this.runtime.stopGateway()
    } catch {
      // Best effort — gateway may already be gone
    }
    try {
      await this.runtime.stopMachineIfSafe()
    } catch {
      // Best effort
    }
    this.controlPlaneStatus = 'disconnected'
    // ... rest of the resets
  }
}
```

How can I resolve this? If you propose a fix, please make it concise.

---

This is a comment left during a code review.
Path: packages/browseros-agent/apps/server/src/api/services/openclaw/openclaw-service.ts
Line: 219-238

Comment:
**`allowedOrigins` config may reference a stale port after setup**

In `setup()`, `applyBrowserosConfig()` writes `gateway.controlUi.allowedOrigins` using `this.port` before `launchGatewayRuntime` has run. If `startGateway` selects a new ephemeral port (bind conflict on the preferred port), `this.port` is updated only after startup succeeds, so the allowed-origins list baked into `openclaw.json` will point to the old port, not the actual bound port. The mismatch won't affect API calls from BrowserOS, but it will break any browser request to the control UI origin (CORS rejection).

How can I resolve this? If you propose a fix, please make it concise.

---

This is a comment left during a code review.
Path: packages/browseros-agent/apps/server/src/api/services/openclaw/container-runtime.ts
Line: 132-149

Comment:
**Redundant conditional structure in the bind-conflict retry loop**

The three separate `if (bindConflict …)` blocks (log, remove, continue) are correct but hard to follow and can be consolidated into a single block. The structure makes it look like the third `if (bindConflict && attempt < GATEWAY_START_MAX_ATTEMPTS) { continue }` is dead code when in fact it is the live `continue` path (the first block only logs).

How can I resolve this? If you propose a fix, please make it concise.

---

This is a comment left during a code review.
Path: packages/browseros-agent/apps/server/src/api/services/openclaw/openclaw-service.ts
Line: 1032-1048

Comment:
**Unnecessary duck-type guard on `inspectGateway`**

`this.runtime` is statically typed as `ContainerRuntime`, which declares `inspectGateway` as a public method. The cast to `Partial<ContainerRuntime> & { inspectGateway?: … }` and the `typeof runtime.inspectGateway !== 'function'` guard are only needed for test injection via a mock without the method. Consider using a test double that satisfies the full type instead of widening the production code path.

**Rule Used:** Remove unused/dead code rather than leaving it in ... ([source](https://app.greptile.com/review/custom-context?memory=9b045db4-2630-428c-95b7-ccf048d34547))

**Learned From**
[browseros-ai/BrowserOS-agent#126](https://github.com/browseros-ai/BrowserOS-agent/pull/126)

How can I resolve this? If you propose a fix, please make it concise.

Reviews (1): Last reviewed commit: "fix(openclaw): harden reset and verifica..."

greptile-apps · 2026-04-20T22:50:55Z

+  private hasFailureState(): boolean {
+    return (
+      this.controlPlaneStatus === 'failed' ||
+      this.controlPlaneStatus === 'recovering' ||
+      this.lastGatewayError !== null ||
+      this.lastError !== null ||
+      this.lastRecoveryReason !== null
+    )
+  }


hasFailureState() treats recovering as an error, masking repair progress

hasFailureState() returns true when controlPlaneStatus === 'recovering', so resolveStatus() always returns 'error' while repairRuntime is active (which sets the status to 'recovering'). The intended 'starting' branch in resolveStatus is never reached for 'recovering' because hasFailureState short-circuits it. On the client side, getOpenClawOperatorState maps status === 'error' to 'needs-attention', so the UI shows an error card with repair/reset controls during an ongoing repair — the user can re-trigger repair while one is already running.

Suggested change

   private hasFailureState(): boolean {
     return (
       this.controlPlaneStatus === 'failed' ||
-      this.controlPlaneStatus === 'recovering' ||
       this.lastGatewayError !== null ||
       this.lastError !== null ||
       this.lastRecoveryReason !== null
     )
   }

greptile-apps · 2026-04-20T22:50:56Z

+  async resetRuntime(): Promise<void> {
+    try {
+      this.stopGatewayLogTail()
+      await this.runtime.stopGateway()
+      await this.runtime.stopMachineIfSafe()
+      this.controlPlaneStatus = 'disconnected'
+      this.tokenLoaded = false
+      this.lastGatewayError = null
+      this.lastRecoveryReason = null
+      this.lastError = null
+      this.applyGatewayPort(OPENCLAW_GATEWAY_PORT)
+      await this.saveRuntimeState(this.defaultRuntimeState())
+      logger.info('OpenClaw runtime reset', { port: this.port })
+    } catch (error) {
+      this.controlPlaneStatus = 'failed'
+      this.lastError = error instanceof Error ? error.message : String(error)
+      this.lastGatewayError = this.lastError
+      this.lastRecoveryReason = this.classifyControlPlaneError(error)
+      throw error
+    }
+  }


resetRuntime lets stopGateway throw, blocking reset when the gateway is already dead

Unlike repairRuntime, which wraps both stopGateway and stopMachineIfSafe in their own try/catch (best-effort), resetRuntime calls stopGateway() bare. If the Podman machine is down and the Podman CLI itself fails (exits non-zero for a reason other than a missing container), stopGateway throws and resetRuntime re-throws without ever clearing controlPlaneStatus, lastError, or the persisted runtime state. A user hitting "Reset" as a last resort after a machine failure will find that reset itself fails for exactly the same reason.

// Suggested fix: mirror the pattern from repairRuntime
async resetRuntime(): Promise<void> {
  try {
    this.stopGatewayLogTail()
    try {
      await this.runtime.stopGateway()
    } catch {
      // Best effort: gateway may already be gone
    }
    try {
      await this.runtime.stopMachineIfSafe()
    } catch {
      // Best effort
    }
    this.controlPlaneStatus = 'disconnected'
    // ... rest of the resets
  }
}

Nikhil (shadowfax92) added 20 commits April 20, 2026 14:12

fix(openclaw): add runtime state persistence

b935f1e

fix(openclaw): validate runtime state on load

e3c9f71

fix(openclaw): validate runtime state numeric fields

6631d22

fix(openclaw): validate runtime state with zod

00f16f1

fix(openclaw): make PodmanRuntime own named machine

57f2a39

test(openclaw): cover factory path and ensureReady

110a1db

refactor(openclaw): infer runtime state from schema

4bdaf2c

fix(openclaw): honor platform in bundled podman path

836accd

feat(openclaw): add gateway host port discovery

71f1c7a

test(openclaw): cover gateway inspect fallbacks

e137108

fix(openclaw): separate inspect capture and retry bind conflicts

68ed868

fix(openclaw): clean up gateway container before retry

6300b16

fix(openclaw): remove gateway container after exhausted retry

3e8cc8b

fix(openclaw): persist runtime host port state

1c207ea

fix(openclaw): harden runtime status handling

0f352f2

feat: add OpenClaw repair and reset controls

014d396

fix: tighten openclaw operator state mapping

b2b2811

fix: harden openclaw chat and podman guards

49112c0

chore: verify openclaw runtime hardening

9cf43bd

fix(openclaw): harden reset and verification tests

a4529c3

github-actions Bot added the fix label Apr 20, 2026

greptile-apps Bot reviewed Apr 20, 2026

Nikhil (shadowfax92) closed this Apr 22, 2026

Nikhil (shadowfax92) wants to merge 20 commits into dev from fix/apr_20-podman-fixes-1
