Add Codex CLI runtime support #72
Merged
The codex harness patch was replacing _schema.build_prompt_suffix and _runner.build_prompt_suffix globally at import time, so claude_code and open_code runs were also receiving the codex-specific instruction: "Do not try to create .agentfield_output.json yourself; the Codex CLI will persist your final JSON response for AgentField."

That instruction is wrong for those providers: Claude / OpenCode are supposed to use their Write tool to create the output file (the fast path the runner expects). Forcing them onto the stdout-parse fallback costs latency, drops the inline schema for small schemas, and sends a confusing instruction referencing a Codex CLI that isn't in the loop.

Use a contextvars.ContextVar set by a wrapped Agent.harness so that the suffix dispatcher returns the codex-native suffix only when the active call is for codex, and falls back to the original AgentField suffix for every other provider.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The codex strict-schema patch strips `default` from properties and marks every field as required, so when FastPlanResult flows through Codex the model has to invent a value for `fallback_used`. Despite the prompt example showing `false`, Codex sometimes returns `true` alongside a perfectly valid task list, making the flag meaningless for any downstream consumer that gates on it.

`fallback_used` is planner-side state, not an LLM self-assessment: it should be True iff the planner's `_fallback_plan(...)` path ran. Override it back to False after a successful parse so the flag reflects what actually happened, regardless of what the model wrote.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
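The override could look roughly like the sketch below. `PlanResult`, `fallback_plan`, and `finalize_plan` are hypothetical stand-ins for `FastPlanResult`, `_fallback_plan(...)`, and the planner's post-parse step; the real model and its fields may differ:

```python
from dataclasses import dataclass, field, replace
from typing import List, Optional


@dataclass(frozen=True)
class PlanResult:
    # Hypothetical, reduced stand-in for FastPlanResult.
    tasks: List[str] = field(default_factory=list)
    rationale: str = ""
    fallback_used: bool = False


def fallback_plan(reason: str) -> PlanResult:
    # Only this path is allowed to report fallback_used=True.
    return PlanResult(
        tasks=["implement_goal"],
        rationale=f"Fallback plan: {reason}",
        fallback_used=True,
    )


def finalize_plan(parsed: Optional[PlanResult]) -> PlanResult:
    """Make fallback_used reflect which code path ran, not the model's answer."""
    if parsed is None:
        return fallback_plan("LLM did not return a parseable result.")
    # Successful parse: discard whatever value the model invented.
    return replace(parsed, fallback_used=False)
```

Treating the flag as derived state keeps the schema unchanged for Codex while removing the model's ability to influence it.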
Two gotchas surfaced when actually running a full main-mode build with the codex runtime that weren't covered in the existing setup notes:

1. The Docker image bakes ENV HARNESS_MODEL=openrouter/moonshotai/kimi-k2.6 as an OpenCode-side fallback, and SWE-AF's model-resolution env cascade reads HARNESS_MODEL. So a codex deployment that only sets SWE_DEFAULT_RUNTIME=codex (without SWE_DEFAULT_MODEL) hands an OpenRouter Kimi model id to the Codex CLI and the Product Manager reasoner fails in ~13s. Document that SWE_DEFAULT_MODEL=gpt-5.3-codex (or per-build models map) is required to pin the Codex model.

2. Codex CLI's workspace-write sandbox uses bubblewrap (`bwrap`) and needs Linux user namespaces enabled on the host. Docker-on-WSL2 and hardened environments refuse with "bwrap: No permissions to create a new namespace", and the coder agents return success while writing no files. Document the symptom so operators can recognize and fix it.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
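To illustrate the first gotcha, here is a sketch of how an env cascade can hand the baked-in `HARNESS_MODEL` to Codex, plus a preflight check an operator might run. The cascade order, `resolve_model`, and `codex_preflight` are assumptions made for illustration; the commit message only states that the cascade reads `HARNESS_MODEL`:

```python
from typing import List, Mapping, Optional

# Assumed lookup order; SWE-AF's real cascade may contain more variables.
MODEL_ENV_CASCADE = ("SWE_DEFAULT_MODEL", "HARNESS_MODEL")


def resolve_model(env: Mapping[str, str]) -> Optional[str]:
    """Return the first non-empty model id found in the cascade."""
    for key in MODEL_ENV_CASCADE:
        value = env.get(key, "").strip()
        if value:
            return value
    return None


def codex_preflight(env: Mapping[str, str]) -> List[str]:
    """Report the misconfiguration where codex inherits a non-Codex model."""
    if env.get("SWE_DEFAULT_RUNTIME") != "codex":
        return []
    if env.get("SWE_DEFAULT_MODEL", "").strip():
        return []
    leaked = resolve_model(env)
    return [
        "SWE_DEFAULT_RUNTIME=codex without SWE_DEFAULT_MODEL: "
        f"Codex CLI would receive {leaked!r}"
    ]


# Environment as baked into the image, with only the runtime overridden.
image_env = {
    "HARNESS_MODEL": "openrouter/moonshotai/kimi-k2.6",
    "SWE_DEFAULT_RUNTIME": "codex",
}
```

With `image_env`, `resolve_model` returns the OpenRouter Kimi id; adding `SWE_DEFAULT_MODEL=gpt-5.3-codex` makes the preflight list empty.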
Summary
Adds conservative Codex CLI runtime support to SWE-AF without changing the existing default runtime or replacing Claude/OpenCode/HAX behavior.
This lets both SWE-AF planner and fast agents run through AgentField's Codex provider using either:
- `codex login` credentials mounted from `~/.codex`.
- `OPENAI_API_KEY`.

What Changed
Runtime/provider support
- `claude_code`, `open_code`, and `codex`.
- `runtime: "codex"`.
- `runtime: "codex"`.
- `opencode`.

Codex structured-output support
- `--output-schema`
- `--output-last-message`
- `$defs`/`definitions` so schemas with referenced nested models are accepted by Codex.

Docker and compose support
- `@openai/codex` in the Docker image.
- `codex` wrapper supporting `SWE_CODEX_AUTH_MODE`:
  - `auto`: use `OPENAI_API_KEY` when present, otherwise local Codex login.
  - `chatgpt`: force ChatGPT-login mode by unsetting `OPENAI_API_KEY` for Codex.
  - `api_key`: require `OPENAI_API_KEY`.
- `~/.codex` into both `swe-agent` and `swe-fast` containers.
- `SWE_DEFAULT_RUNTIME`, `SWE_DEFAULT_MODEL`, `SWE_CODEX_AUTH_MODE`, and `OPENAI_API_KEY` where needed.
- `.env` loading to `swe-fast` so it behaves consistently with `swe-agent`.

Documentation
- `.env.example`, README, deployment docs, architecture docs, contributing docs, and skill docs.
- `claude_code`, `open_code`, `codex`.
- `codex login` on the host.
- `SWE_CODEX_AUTH_MODE=chatgpt` or use `auto` with no `OPENAI_API_KEY`.
- `SWE_CODEX_AUTH_MODE=api_key`.
- `OPENAI_API_KEY`.
- `runtime: "codex"` and `models.default: "gpt-5.3-codex"`.

Files Changed
Runtime/config:
- `swe_af/runtime/providers.py`
- `swe_af/runtime/__init__.py`
- `swe_af/runtime/codex_harness_patch.py`
- `swe_af/execution/schemas.py`
- `swe_af/fast/schemas.py`
- `swe_af/reasoners/execution_agents.py`
- `swe_af/reasoners/pipeline.py`
- `swe_af/execution/_replanner_compat.py`
- `swe_af/fast/planner.py`
- `swe_af/fast/app.py`
- `swe_af/fast/__init__.py`
- `swe_af/reasoners/__init__.py`

Docker/config/docs:
- `Dockerfile`
- `docker-compose.yml`
- `docker-compose.local.yml`
- `requirements-docker.txt`
- `.env.example`
- `README.md`
- `docs/ARCHITECTURE.md`
- `docs/CONTRIBUTING.md`
- `docs/SKILL.md`
- `docs/deployment.md`

Tests:
- `tests/test_model_config.py`
- `tests/test_runtime_provider_routing.py`
- `tests/test_codex_harness_patch.py`
- `tests/test_dockerfile.py`
- `tests/fast/test_app.py`
- `tests/fast/test_docker_config.py`
- `tests/fast/test_fast_init_executor_planner_verifier_routing.py`

Audit Trail / Bugs Found During Validation
1. Docker agents exited with code 132 on Linux/aarch64
Observed both `swe-agent` and `swe-fast` start, then exit immediately with code `132`.

Root cause:
`cryptography==48.0.0` crashed with `SIGILL` while AgentField imported Ed25519 for DID registration.

Fix:
`cryptography<46` in `requirements-docker.txt`.

Validation:
2. `swe-fast` did not load `.env`
Observed `swe-agent` had `env_file: .env`, but `swe-fast` only used shell-substituted environment values.

Risk:
`.env` and have planner work while fast silently missed equivalent settings.

Fix:
`env_file: .env` to `swe-fast`.

3. Codex was invoked but structured output fell back
Initial smoke test proved Codex was triggered, but fast planner returned deterministic fallback:
{ "fallback_used": true, "rationale": "Fallback plan: LLM did not return a parseable result." }

Root causes:
- `.agentfield_output.json`, but Codex CLI was running read-only and did not have AgentField's Write tool.
- `$defs`, so nested models still had optional/default fields omitted from `required`.
- `FastTask` schema with `invalid_json_schema`.

Fixes:
- `--output-schema` and `--output-last-message`.
- `$defs` and `definitions`.

Validation:
- `FastPlanResult` schema now succeeds.
- `swe-fast.fast_plan_tasks` now succeeds with `fallback_used: false`.

Local Validation
Docker stack:
Codex auth inside both containers:
AgentField Codex smoke run:
Smoke execution:
Returned parsed task:
{ "name": "inspect_startup_docs", "title": "Inspect Repository Startup Documentation" }

Static checks:
Note: full pytest was not run locally because the host environment did not have pytest installed.
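For reference, the strict-schema normalization discussed under bug 3 (stripping `default`, marking every property as required, and recursing into `$defs` and `definitions`) can be sketched as follows. This is an illustrative reimplementation under stated assumptions, not the code in `swe_af/runtime/codex_harness_patch.py`; `make_strict` and the example schema are hypothetical:

```python
import copy
from typing import Any, Dict


def _visit(node: Any) -> None:
    """Normalize one schema node in place, then recurse into sub-schemas."""
    if isinstance(node, list):
        for item in node:
            _visit(item)
        return
    if not isinstance(node, dict):
        return
    node.pop("default", None)  # drop the schema-level default
    props = node.get("properties")
    if isinstance(props, dict):
        node["required"] = list(props)  # every declared field is required
        for sub in props.values():
            _visit(sub)
    # Nested models live under $defs (newer drafts) or definitions (older).
    for key in ("$defs", "definitions"):
        defs = node.get(key)
        if isinstance(defs, dict):
            for sub in defs.values():
                _visit(sub)
    for key in ("items", "anyOf", "allOf", "oneOf"):
        if key in node:
            _visit(node[key])


def make_strict(schema: Dict[str, Any]) -> Dict[str, Any]:
    """Return a normalized deep copy; the input schema is left untouched."""
    strict = copy.deepcopy(schema)
    _visit(strict)
    return strict


# Hypothetical, reduced schema shaped like a plan result with a nested task.
example_schema = {
    "type": "object",
    "properties": {
        "tasks": {"type": "array", "items": {"$ref": "#/$defs/FastTask"}},
        "fallback_used": {"type": "boolean", "default": False},
    },
    "required": ["tasks"],
    "$defs": {
        "FastTask": {
            "type": "object",
            "properties": {
                "name": {"type": "string"},
                "title": {"type": "string", "default": ""},
            },
            "required": ["name"],
        }
    },
}
```

Handling `properties` separately from the generic recursion avoids deleting a property that happens to be named `default`.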
Compatibility Notes
claude_code.