WingedGuardian batch 1: oxwork + jintel + deadlock recovery + cascade fix#1
Conversation
…evenue marketplace Surgically extracted the oxwork-specific additions from PR Conway-Research#254, skipping the bundled harness architecture refactoring. Adds 4 tools (oxwork_browse, oxwork_claim, oxwork_submit, oxwork_status), a 30-min heartbeat scanner, the service_revenue ledger type, and 17 unit tests (17/17 pass). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…x402-paid financial intelligence tools Isolated extraction of the Jintel integration: src/jintel/ directory (7 files), jintel adapter test (8/8 pass), default skill registration, spread into tools array, and spend tracking in executeTool. Adds @yojinhq/jintel-client, zod, zod-to-json-schema deps. All 138 tests pass with zero regressions. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
… deadlock recovery, balance reporting Surgically extracted the orchestration improvements from PR Conway-Research#298, skipping the bundled harness refactoring. Adds: stale task recovery count cap (3 retries before failing), deadlock detection (pending tasks + no workers = fail goal), child agent balance reporting via status_report KV cache, and 44 new tests. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…he turn RETRYABLE_STATUS_CODES only covered 429/500/503, so a 404 from a deprecated model (e.g. Cerebras qwen-3-235b) would hard-fail the whole turn with no fallback. Now isRetryableError() returns true for 404 and executeWithRetries immediately cascades to the next provider without wasting the retry budget on a model that will never come back. Regression test verifies the cascade and confirms non-404 errors (400) still hard-fail as before. Fixes Billy's 4-week Cerebras-404 error loop (Conway-Research#266). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 37fa789830
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| agent_pool_optimize: 1_800_000, | ||
| knowledge_store_prune: 86_400_000, | ||
| dead_agent_cleanup: 3_600_000, | ||
| scan_oxwork_tasks: 1_800_000, // 30 min — pure HTTP, zero inference cost |
There was a problem hiding this comment.
Schedule the new 0xWork scanner by default
Adding the builtin task and interval here does not make it run: createHeartbeatDaemon only seeds tasks from heartbeatConfig.entries, which come from src/heartbeat/config.ts's DEFAULT_HEARTBEAT_CONFIG/merged YAML entries. In a normal install with the default heartbeat config, scan_oxwork_tasks is never inserted into heartbeat_schedule, so the advertised periodic 0xWork opportunity scan will not execute unless a user manually adds it to heartbeat.yml.
Useful? React with 👍 / 👎.
| const lines = tasks.map( | ||
| (t) => | ||
| `#${t.id} [${t.status}] $${t.bounty_amount} — ${t.description.slice(0, 100)}${t.description.length > 100 ? "…" : ""}`, | ||
| ); | ||
| return `Your 0xWork tasks (${tasks.length}):\n\n${lines.join("\n")}`; |
There was a problem hiding this comment.
Record approved 0xWork bounties in status checks
When a submitted 0xWork task later becomes approved/completed, this status check only formats the returned tasks and never inserts the new service_revenue transaction that the balance report now counts, despite oxwork_submit deferring revenue recognition until oxwork_status. In that approval scenario, earned bounties will continue to show as zero revenue in financial reports; this path should detect newly paid/completed tasks and record the bounty once, with a de-dup key.
Useful? React with 👍 / 👎.
| return ( | ||
| `Successfully claimed task #${claimed.id}!\n` + | ||
| `Category: ${claimed.category}\n` + | ||
| `Bounty: $${claimed.bounty_amount} USDC\n` + | ||
| `Deadline: ${new Date(claimed.deadline * 1000).toISOString()}\n` + | ||
| `Description: ${claimed.description}` |
There was a problem hiding this comment.
Sanitize claimed task descriptions before returning
When a marketplace task description contains prompt-injection text, oxwork_claim returns claimed.description directly to the model, but only oxwork_browse and oxwork_status were added to EXTERNAL_SOURCE_TOOLS, so executeTool will not run the tool-output sanitizer for this claim result. This exposes untrusted 0xWork content exactly after the agent claims a task; either add oxwork_claim to the external-source set or sanitize the description before returning it.
Useful? React with 👍 / 👎.
What this does
Four targeted changes that take Billy from a stuck 404-loop to a functional autonomous agent with revenue paths:
fix(inference): cascade past 404 model_not_found
This is the critical fix.
RETRYABLE_STATUS_CODESonly covered[429, 500, 503]— a 404 from a deprecated model (Cerebrasqwen-3-235b-a22b-instruct-2507) hard-failed every turn with no fallback. NowisRetryableError()returnstruefor 404 andexecuteWithRetriesimmediately cascades to the next provider (Groq, OpenRouter) without wasting retry budget on a model that will never respond.cherry-pick(0xwork): import PR Conway-Research#254 from tyxben
Surgically extracted oxwork-specific additions from upstream PR Conway-Research#254 (skipping the bundled harness refactoring). Adds 4 tools (
oxwork_browse,oxwork_claim,oxwork_submit,oxwork_status), a 30-min heartbeat scanner, andservice_revenueledger type. 17/17 tests pass.cherry-pick(jintel): import PR Conway-Research#318 from 0xtechdean
41 x402-paid financial intelligence tools via
@yojinhq/jintel-client. Isolated undersrc/jintel/, spread into tool registry, default skill registered. 8/8 adapter tests pass.cherry-pick(orchestration): import PR Conway-Research#298 from wanikua
Stale task recovery count cap (3 retries before failing a task), deadlock detection (pending tasks + no available workers → fail goal cleanly instead of looping forever). Child agent balance reporting via
status_reportKV cache. 44 new tests.Test summary
What this does NOT do
🤖 Generated with Claude Code