listCodexModels spawns new codex app-server on every request, causing intermittent 120s timeouts

## Problem

`GET /api/sessions/{id}/codex-models` intermittently returns **500 after 120 seconds**, which Cloudflare Tunnel then surfaces to the client as a **502 Bad Gateway**.

### Root Cause

In `cli/src/modules/common/codexModels.ts`, `listCodexModels()` creates a **new `CodexAppServerClient` on every call**, which spawns a fresh `codex app-server` child process:

```ts
export async function listCodexModels(includeHidden: boolean = false): Promise<CodexModelSummary[]> {
    const client = new CodexAppServerClient();
    try {
        await client.connect();        // spawns "codex app-server" process
        await client.initialize(...);   // 30s timeout
        const response = await client.listModels({ includeHidden }); // 30s timeout
        ...
    } finally {
        await client.disconnect();     // kills the process
    }
}
```

The hub-side RPC timeout is `MODEL_LIST_RPC_TIMEOUT_MS = 120_000` (120s). When the spawned `codex app-server` is slow to respond (e.g., OpenAI token refresh stalls), the RPC times out at 120s and returns 500.

### Evidence

From `hub.log`, out of ~168 `codex-models` requests:
- **163 (97%)** succeeded in 1-5 seconds
- **5 (3%)** timed out at exactly 120s → 500

From `runner.log`, the `codex app-server` processes spawned for model listing consistently exit within ~1 second:
```
[09:21:39.748] List Codex models request
[09:21:39.748] [CodexAppServer] Connected
[09:21:40.147] Codex app-server exited (code=0, signal=null)
[09:21:40.157] [CodexAppServer] Disconnected
```

In this excerpt the process exits about 0.4 seconds after connecting. This suggests the app-server sometimes fails silently (exits before completing `listModels`), and because nothing reports the early exit, the hub waits out the full 120s RPC timeout.
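To illustrate how an early exit could be surfaced immediately, here is a minimal sketch that races a pending request against the child process's `exit` event. `failFastOnExit` is a hypothetical helper, not part of hapi or `CodexAppServerClient`; it assumes the client can expose its child process (or any `EventEmitter` that emits `exit` with `code` and `signal`) and the promise for the in-flight request.

```ts
import { EventEmitter } from "node:events";

// Hypothetical helper (not existing hapi code): settles with the request's
// result, but rejects early if the process exits first or the timeout elapses.
export function failFastOnExit<T>(
    proc: EventEmitter,
    request: Promise<T>,
    timeoutMs: number,
): Promise<T> {
    return new Promise<T>((resolve, reject) => {
        // Node's ChildProcess emits "exit" with (code, signal).
        const onExit = (code: number | null, signal: string | null) => {
            cleanup();
            reject(new Error(`app-server exited before responding (code=${code}, signal=${signal})`));
        };
        const timer = setTimeout(() => {
            cleanup();
            reject(new Error(`no response within ${timeoutMs}ms`));
        }, timeoutMs);
        // Remove the timer and the listener so nothing leaks after settling.
        const cleanup = () => {
            clearTimeout(timer);
            proc.off("exit", onExit);
        };
        proc.once("exit", onExit);
        request.then(
            (value) => { cleanup(); resolve(value); },
            (err) => { cleanup(); reject(err); },
        );
    });
}
```

With something like this wrapped around the `listModels` call, a silent exit would turn into an error within milliseconds instead of a 120s hub-side timeout. Whether the real client can expose its child process this way is an assumption that would need checking against `CodexAppServerClient`.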

### Environment

- hapi: 0.19.0
- codex-cli: 0.136.0
- OS: Ubuntu 24.04 (Linux x86_64)
- codex auth mode: chatgpt (OAuth token with refresh)

### Suggested Fix

1. **Cache the model list** at the runner/machine level. Models rarely change, so a TTL of 5-10 minutes should be safe.
2. **Reuse a persistent app-server** instead of spawning one per request.
3. **Reduce the RPC timeout.** 120s is excessive for a model list that normally takes <5s.
4. **Add faster failure detection.** If the app-server exits early, return an error immediately instead of waiting for the full RPC timeout.
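As a rough sketch of the caching suggestion, the wrapper below memoizes an async fetcher for a fixed TTL and shares one in-flight call among concurrent requests, so a burst of requests spawns at most one app-server. `cachedWithTtl` and its parameters are hypothetical names for illustration, not existing hapi code.

```ts
// Hypothetical TTL cache (illustrative only). `now` is injectable for testing.
export function cachedWithTtl<T>(
    fetcher: () => Promise<T>,
    ttlMs: number,
    now: () => number = Date.now,
): () => Promise<T> {
    let cached: { value: T; expiresAt: number } | undefined;
    let inFlight: Promise<T> | undefined;

    return async () => {
        // Serve from cache while the entry is still fresh.
        if (cached && now() < cached.expiresAt) return cached.value;
        // Share a fetch that is already running instead of starting another.
        if (inFlight) return inFlight;
        const pending = fetcher().then(
            (value) => {
                cached = { value, expiresAt: now() + ttlMs };
                inFlight = undefined;
                return value;
            },
            (err) => {
                // Failures are not cached, so the next call retries.
                inFlight = undefined;
                throw err;
            },
        );
        inFlight = pending;
        return pending;
    };
}

// Usage sketch, assuming the listCodexModels function shown in this issue:
// const listVisibleModelsCached = cachedWithTtl(() => listCodexModels(false), 5 * 60_000);
```

Because `listCodexModels` takes an `includeHidden` flag, one cache instance per flag value would probably be needed. This addresses suggestion 1 only; it reduces how often the slow path is hit but does not shorten a single stalled call.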
