WingedGuardian batch 1: oxwork + jintel + deadlock recovery + cascade fix #1

Merged
WingedGuardian merged 4 commits into main from wg/cherry-pick-batch-1
May 30, 2026

Conversation

@WingedGuardian

Owner

What this does

Four targeted changes that take Billy from a stuck 404-loop to a functional autonomous agent with revenue paths:

fix(inference): cascade past 404 model_not_found

This is the critical fix. RETRYABLE_STATUS_CODES only covered [429, 500, 503] — a 404 from a deprecated model (Cerebras qwen-3-235b-a22b-instruct-2507) hard-failed every turn with no fallback. Now isRetryableError() returns true for 404 and executeWithRetries immediately cascades to the next provider (Groq, OpenRouter) without wasting retry budget on a model that will never respond.
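The cascade described above might look roughly like the following sketch. Only `RETRYABLE_STATUS_CODES`, `isRetryableError` and `executeWithRetries` are names from this PR; the `InferenceProvider` shape, the `httpError` and `statusOf` helpers, the `CASCADE_IMMEDIATELY` list and the synchronous `call` signature are assumptions made to keep the example self-contained, and the real implementation is presumably asynchronous.

```typescript
// Sketch only: apart from RETRYABLE_STATUS_CODES, isRetryableError and
// executeWithRetries, every name here is a hypothetical stand-in.
const RETRYABLE_STATUS_CODES = [404, 429, 500, 503];
// Statuses where the same provider will never recover, so skip its retries.
const CASCADE_IMMEDIATELY = [404];

interface InferenceProvider {
  name: string;
  call: (prompt: string) => string; // synchronous for brevity (assumption)
}

// Build an Error carrying an HTTP status (hypothetical helper).
function httpError(status: number, message: string): Error {
  return Object.assign(new Error(message), { status });
}

// Read the HTTP status off an unknown thrown value, if it has one.
function statusOf(err: unknown): number | undefined {
  if (typeof err === "object" && err !== null && "status" in err) {
    const status = (err as { status: unknown }).status;
    return typeof status === "number" ? status : undefined;
  }
  return undefined;
}

function isRetryableError(err: unknown): boolean {
  const status = statusOf(err);
  return status !== undefined && RETRYABLE_STATUS_CODES.includes(status);
}

function executeWithRetries(
  providers: InferenceProvider[],
  prompt: string,
  maxRetries = 2,
): string {
  let lastError: unknown = new Error("no providers configured");
  for (const provider of providers) {
    for (let attempt = 0; attempt <= maxRetries; attempt++) {
      try {
        return provider.call(prompt);
      } catch (err) {
        lastError = err;
        // Non-retryable errors (for example a 400) still hard-fail the turn.
        if (!isRetryableError(err)) throw err;
        // A 404 leaves this provider at once and keeps the retry budget unused.
        const status = statusOf(err);
        if (status !== undefined && CASCADE_IMMEDIATELY.includes(status)) break;
      }
    }
  }
  throw lastError;
}
```

In this sketch a 429, 500 or 503 is retried on the same provider up to `maxRetries` times before moving on, while a 404 moves to the next provider after a single attempt.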

cherry-pick(0xwork): import PR Conway-Research#254 from tyxben

Surgically extracted oxwork-specific additions from upstream PR Conway-Research#254 (skipping the bundled harness refactoring). Adds 4 tools (oxwork_browse, oxwork_claim, oxwork_submit, oxwork_status), a 30-min heartbeat scanner, and service_revenue ledger type. 17/17 tests pass.

cherry-pick(jintel): import PR Conway-Research#318 from 0xtechdean

41 x402-paid financial intelligence tools via @yojinhq/jintel-client. Isolated under src/jintel/, spread into tool registry, default skill registered. 8/8 adapter tests pass.

cherry-pick(orchestration): import PR Conway-Research#298 from wanikua

Stale task recovery count cap (3 retries before failing a task), deadlock detection (pending tasks + no available workers → fail goal cleanly instead of looping forever). Child agent balance reporting via status_report KV cache. 44 new tests.
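As a rough illustration of the two orchestration rules above, the sketch below caps stale-task recovery at three attempts and flags a deadlock when tasks are pending and no worker is available. The `GoalTask` shape, the `recoveryCount` field, and the function names are hypothetical and not taken from PR Conway-Research#298.

```typescript
// Sketch only: GoalTask, recoveryCount, recoverStaleTask and isDeadlocked
// are hypothetical names, not the orchestrator's actual API.
type GoalTaskStatus = "pending" | "running" | "failed" | "done";

interface GoalTask {
  id: string;
  status: GoalTaskStatus;
  recoveryCount: number; // how many times this task was recovered from stale
}

const MAX_STALE_RECOVERIES = 3;

// Re-queue a stale task, or fail it once the recovery cap is exhausted.
function recoverStaleTask(task: GoalTask): GoalTask {
  if (task.recoveryCount >= MAX_STALE_RECOVERIES) {
    return { ...task, status: "failed" };
  }
  return { ...task, status: "pending", recoveryCount: task.recoveryCount + 1 };
}

// Pending work with nobody able to pick it up: fail the goal instead of looping.
function isDeadlocked(tasks: GoalTask[], availableWorkers: number): boolean {
  const hasPending = tasks.some((task) => task.status === "pending");
  return hasPending && availableWorkers === 0;
}
```

Under these assumptions a task is re-queued on its first three stale detections and marked failed on the fourth.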

Test summary

  • 25/25 inference tests (includes 2 new 404-cascade regression tests)
  • 17/17 oxwork tests
  • 8/8 jintel adapter tests
  • 3/3 orchestrator deadlock tests
  • 5/5 balance report tests
  • 71/71 tools-security tests
  • 23/23 financial tests
  • 19/19 heartbeat tests

What this does NOT do

🤖 Generated with Claude Code

Genesis and others added 4 commits May 29, 2026 20:35
…evenue marketplace

Surgically extracted the oxwork-specific additions from PR Conway-Research#254, skipping
the bundled harness architecture refactoring. Adds 4 tools (oxwork_browse,
oxwork_claim, oxwork_submit, oxwork_status), a 30-min heartbeat scanner,
the service_revenue ledger type, and 17 unit tests (17/17 pass).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…x402-paid financial intelligence tools

Isolated extraction of the Jintel integration: src/jintel/ directory (7 files),
jintel adapter test (8/8 pass), default skill registration, spread into tools
array, and spend tracking in executeTool. Adds @yojinhq/jintel-client, zod,
zod-to-json-schema deps. All 138 tests pass with zero regressions.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
… deadlock recovery, balance reporting

Surgically extracted the orchestration improvements from PR Conway-Research#298, skipping the
bundled harness refactoring. Adds: stale task recovery count cap (3 retries
before failing), deadlock detection (pending tasks + no workers = fail goal),
child agent balance reporting via status_report KV cache, and 44 new tests.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…he turn

RETRYABLE_STATUS_CODES only covered 429/500/503, so a 404 from a deprecated
model (e.g. Cerebras qwen-3-235b) would hard-fail the whole turn with no
fallback. Now isRetryableError() returns true for 404 and executeWithRetries
immediately cascades to the next provider without wasting the retry budget on
a model that will never come back. Regression test verifies the cascade and
confirms non-404 errors (400) still hard-fail as before.

Fixes Billy's 4-week Cerebras-404 error loop (Conway-Research#266).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@WingedGuardian WingedGuardian merged commit 5fad708 into main May 30, 2026
@WingedGuardian WingedGuardian deleted the wg/cherry-pick-batch-1 branch May 30, 2026 02:19

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 37fa789830

Comment thread src/heartbeat/tasks.ts
agent_pool_optimize: 1_800_000,
knowledge_store_prune: 86_400_000,
dead_agent_cleanup: 3_600_000,
scan_oxwork_tasks: 1_800_000, // 30 min — pure HTTP, zero inference cost

P2: Schedule the new 0xWork scanner by default

Adding the builtin task and interval here does not make it run: createHeartbeatDaemon only seeds tasks from heartbeatConfig.entries, which come from src/heartbeat/config.ts's DEFAULT_HEARTBEAT_CONFIG/merged YAML entries. In a normal install with the default heartbeat config, scan_oxwork_tasks is never inserted into heartbeat_schedule, so the advertised periodic 0xWork opportunity scan will not execute unless a user manually adds it to heartbeat.yml.
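One way to address this, sketched under assumptions: ship the scanner as a default entry so it is seeded without any user YAML. The `HeartbeatEntry` shape and the `DEFAULT_ENTRIES`, `mergeEntries` and `seedSchedule` names are hypothetical; the actual structure of `DEFAULT_HEARTBEAT_CONFIG` is not shown in this PR.

```typescript
// Sketch only: the entry shape and all function names are assumptions.
interface HeartbeatEntry {
  task: string;
  intervalMs: number;
  enabled: boolean;
}

// Hypothetical defaults that include the new scanner out of the box.
const DEFAULT_ENTRIES: HeartbeatEntry[] = [
  { task: "dead_agent_cleanup", intervalMs: 3_600_000, enabled: true },
  { task: "scan_oxwork_tasks", intervalMs: 1_800_000, enabled: true },
];

// User-supplied entries override defaults with the same task name.
function mergeEntries(
  defaults: HeartbeatEntry[],
  overrides: HeartbeatEntry[],
): HeartbeatEntry[] {
  const byTask = new Map<string, HeartbeatEntry>();
  for (const entry of defaults) byTask.set(entry.task, entry);
  for (const entry of overrides) byTask.set(entry.task, entry);
  return Array.from(byTask.values());
}

// Only enabled entries end up in the schedule (task name -> interval in ms).
function seedSchedule(entries: HeartbeatEntry[]): Map<string, number> {
  const schedule = new Map<string, number>();
  for (const entry of entries) {
    if (entry.enabled) schedule.set(entry.task, entry.intervalMs);
  }
  return schedule;
}
```

With defaults like these, a user could still opt out by adding a disabled `scan_oxwork_tasks` entry to their own config.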

Comment thread src/agent/tools.ts
Comment on lines +3419 to +3423
const lines = tasks.map(
  (t) =>
    `#${t.id} [${t.status}] $${t.bounty_amount} — ${t.description.slice(0, 100)}${t.description.length > 100 ? "…" : ""}`,
);
return `Your 0xWork tasks (${tasks.length}):\n\n${lines.join("\n")}`;

P2: Record approved 0xWork bounties in status checks

When a submitted 0xWork task later becomes approved/completed, this status check only formats the returned tasks and never inserts the new service_revenue transaction that the balance report now counts, despite oxwork_submit deferring revenue recognition until oxwork_status. In that approval scenario, earned bounties will continue to show as zero revenue in financial reports; this path should detect newly paid/completed tasks and record the bounty once, with a de-dup key.
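A minimal sketch of the suggested fix, assuming the status check can see each task's `id`, `status` and `bounty_amount`: record a `service_revenue` entry the first time a task shows up as paid, keyed so that later status checks do not double-count it. The `RevenueEntry` shape, the in-memory `Map` ledger, the `oxwork:<id>` key format, the numeric `bounty_amount`, and the exact status strings are all assumptions.

```typescript
// Sketch only: the ledger shape, key format and status strings are assumptions.
interface OxworkTask {
  id: number;
  status: string;
  bounty_amount: number; // assumed numeric here
}

interface RevenueEntry {
  type: "service_revenue";
  amount: number;
  dedupKey: string;
}

// Statuses treated as "bounty earned" in this sketch.
const PAID_STATUSES = ["approved", "completed"];

// Insert one service_revenue entry per newly paid task; return what was added.
function recordApprovedBounties(
  tasks: OxworkTask[],
  ledger: Map<string, RevenueEntry>,
): RevenueEntry[] {
  const added: RevenueEntry[] = [];
  for (const task of tasks) {
    if (!PAID_STATUSES.includes(task.status)) continue;
    const dedupKey = `oxwork:${task.id}`;
    if (ledger.has(dedupKey)) continue; // already recorded on an earlier check
    const entry: RevenueEntry = {
      type: "service_revenue",
      amount: task.bounty_amount,
      dedupKey,
    };
    ledger.set(dedupKey, entry);
    added.push(entry);
  }
  return added;
}
```

In a real store the de-dup key would more likely be enforced by a unique constraint than by an in-memory lookup.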

Comment thread src/agent/tools.ts
Comment on lines +3319 to +3324
return (
  `Successfully claimed task #${claimed.id}!\n` +
  `Category: ${claimed.category}\n` +
  `Bounty: $${claimed.bounty_amount} USDC\n` +
  `Deadline: ${new Date(claimed.deadline * 1000).toISOString()}\n` +
  `Description: ${claimed.description}`

P2: Sanitize claimed task descriptions before returning

When a marketplace task description contains prompt-injection text, oxwork_claim returns claimed.description directly to the model, but only oxwork_browse and oxwork_status were added to EXTERNAL_SOURCE_TOOLS, so executeTool will not run the tool-output sanitizer for this claim result. This exposes untrusted 0xWork content exactly after the agent claims a task; either add oxwork_claim to the external-source set or sanitize the description before returning it.
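The first option, adding `oxwork_claim` to the external-source set, could be sketched as follows. `EXTERNAL_SOURCE_TOOLS` is named in the review; `sanitizeToolOutput` and `finalizeToolResult` are hypothetical stand-ins, and the toy sanitizer here only illustrates where the real tool-output sanitizer would run.

```typescript
// Sketch only: sanitizeToolOutput and finalizeToolResult are hypothetical,
// and the sanitizer is a deliberately simplistic placeholder.
const EXTERNAL_SOURCE_TOOLS = new Set<string>([
  "oxwork_browse",
  "oxwork_status",
  "oxwork_claim", // claim results also carry untrusted marketplace text
]);

// Toy sanitizer: mark the content as untrusted and blank one known pattern.
function sanitizeToolOutput(text: string): string {
  const flagged = text.replace(/ignore (all )?previous instructions/gi, "[removed]");
  return `[untrusted external content]\n${flagged}`;
}

// Run the sanitizer only for tools whose output comes from external sources.
function finalizeToolResult(toolName: string, rawOutput: string): string {
  return EXTERNAL_SOURCE_TOOLS.has(toolName)
    ? sanitizeToolOutput(rawOutput)
    : rawOutput;
}
```

Keeping the decision in one set lookup means a newly added tool only needs a single entry to be covered.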
