fix(orchestration): deadlock recovery, balance reporting, tests & meow skill #298
Open
wanikua wants to merge 1 commit into
- Fix orchestrator deadlock where dead workers cause infinite stale task recovery loops (Conway-Research#266, Conway-Research#259). Add max recovery count per task, mark dead workers in agent tracker, and detect deadlock state.
- Add child agent balance reporting via status_report messages. SimpleFundingProtocol.getBalance() now prefers recent reported balance over funded_amount_cents upper-bound estimate.
- Add tests for credits, inference budget, config, orchestrator deadlock, and balance reporting (48 new tests).
- Add meow-protocol skill (SKILL.md) for token-efficient inter-agent communication using VQ-VAE discrete codebooks.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
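For reviewers, the recovery cap and deadlock check described in the first bullet can be sketched roughly as follows. This is a minimal illustration, not the PR's code: the `Task` and `WorkerInfo` shapes, `recoverStaleTasks`, and `isDeadlocked` are hypothetical names, and only the cap of 3 recoveries and the "pending tasks + no available workers" rule come from this PR's summary.

```typescript
// Hypothetical shapes for illustration; the orchestrator's real types may differ.
type TaskStatus = "pending" | "assigned" | "failed";

interface Task {
  id: string;
  status: TaskStatus;
  assignedTo: string | null;
  recoveryCount: number;
}

interface WorkerInfo {
  id: string;
  alive: boolean;
}

// The PR summary states 3 recovery attempts per task.
const MAX_RECOVERIES = 3;

// Re-queue tasks whose worker is dead or unknown. A task that has already
// been recovered MAX_RECOVERIES times is failed instead of re-queued, so a
// dead worker cannot drive an endless recovery loop.
function recoverStaleTasks(tasks: Task[], workers: Map<string, WorkerInfo>): void {
  for (const task of tasks) {
    if (task.status !== "assigned" || task.assignedTo === null) continue;
    const worker = workers.get(task.assignedTo);
    if (worker !== undefined && worker.alive) continue; // still being worked on
    if (task.recoveryCount >= MAX_RECOVERIES) {
      task.status = "failed";
    } else {
      task.recoveryCount += 1;
      task.status = "pending";
    }
    task.assignedTo = null;
  }
}

// Deadlock: work is still pending but no live worker exists to take it.
function isDeadlocked(tasks: Task[], workers: Map<string, WorkerInfo>): boolean {
  const hasPending = tasks.some((t) => t.status === "pending");
  const hasLiveWorker = Array.from(workers.values()).some((w) => w.alive);
  return hasPending && !hasLiveWorker;
}
```

In this sketch the caller would fail the goal when `isDeadlocked` returns true, which matches the behavior the summary describes.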
WingedGuardian added a commit to WingedGuardian/automaton that referenced this pull request May 30, 2026
… fix (#1)

* cherry-pick(0xwork): import Conway-Research#254 from tyxben - agent revenue marketplace

  Surgically extracted the oxwork-specific additions from PR Conway-Research#254, skipping the bundled harness architecture refactoring. Adds 4 tools (oxwork_browse, oxwork_claim, oxwork_submit, oxwork_status), a 30-min heartbeat scanner, the service_revenue ledger type, and 17 unit tests (17/17 pass).

  Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* cherry-pick(jintel): import Conway-Research#318 from 0xtechdean - 41 x402-paid financial intelligence tools

  Isolated extraction of the Jintel integration: src/jintel/ directory (7 files), jintel adapter test (8/8 pass), default skill registration, spread into tools array, and spend tracking in executeTool. Adds @yojinhq/jintel-client, zod, zod-to-json-schema deps. All 138 tests pass with zero regressions.

  Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* cherry-pick(orchestration): import Conway-Research#298 from wanikua - deadlock recovery, balance reporting

  Surgically extracted the orchestration improvements from PR Conway-Research#298, skipping the bundled harness refactoring. Adds: stale task recovery count cap (3 retries before failing), deadlock detection (pending tasks + no workers = fail goal), child agent balance reporting via status_report KV cache, and 44 new tests.

  Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(inference): cascade past 404 model_not_found instead of failing the turn

  RETRYABLE_STATUS_CODES only covered 429/500/503, so a 404 from a deprecated model (e.g. Cerebras qwen-3-235b) would hard-fail the whole turn with no fallback. Now isRetryableError() returns true for 404 and executeWithRetries immediately cascades to the next provider without wasting the retry budget on a model that will never come back. Regression test verifies the cascade and confirms non-404 errors (400) still hard-fail as before.

  Fixes Billy's 4-week Cerebras-404 error loop (Conway-Research#266).

  Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Genesis <genesis@wingedguardian.dev>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
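The 404 cascade described in the last commit might look roughly like the sketch below. Only `RETRYABLE_STATUS_CODES`, `isRetryableError`, `executeWithRetries`, and the status codes are taken from the commit message; the `HttpError` and `Provider` shapes, the `maxRetries` parameter, and the synchronous `call` signature are assumptions made to keep the example short (the real inference path is presumably asynchronous).

```typescript
// Hypothetical shapes; the actual inference client types are not shown in this PR.
interface HttpError {
  status: number;
  message: string;
}

interface Provider {
  name: string;
  call: () => string; // synchronous here only to keep the sketch small
}

const RETRYABLE_STATUS_CODES = new Set<number>([429, 500, 503]);
const MODEL_NOT_FOUND = 404;

// 404 counts as recoverable so the caller can fall through to another provider.
function isRetryableError(err: HttpError): boolean {
  return err.status === MODEL_NOT_FOUND || RETRYABLE_STATUS_CODES.has(err.status);
}

function executeWithRetries(providers: Provider[], maxRetries: number): string {
  for (const provider of providers) {
    for (let attempt = 0; attempt <= maxRetries; attempt++) {
      try {
        return provider.call();
      } catch (e) {
        const err = e as HttpError;
        if (!isRetryableError(err)) throw err; // e.g. 400 still hard-fails
        // A missing model will not come back: skip the remaining retries
        // and cascade to the next provider immediately.
        if (err.status === MODEL_NOT_FOUND) break;
      }
    }
  }
  throw new Error("all providers exhausted");
}
```

The point of the `break` is that a deprecated model consumes exactly one attempt, while 429/500/503 still use the per-provider retry budget.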
Summary
- Deadlock recovery (… `executing` with stale `local://` assignment causes repeated `create_goal BLOCKED` sleep loop #259): Dead workers no longer cause infinite stale task recovery loops. Added a max recovery count per task (3 attempts), dead worker marking in the agent tracker, and deadlock detection (pending tasks + no available workers = fail goal).
- Balance reporting: `handleStatusReport` in messaging now extracts `credit_balance` from status reports and stores it in KV. `SimpleFundingProtocol.getBalance()` prefers a recent reported balance (< 10 min) over the `funded_amount_cents` upper-bound estimate.

Test plan
- `credits.test.ts`: survival tier boundaries, formatting, financial state (14 tests)
- `budget.test.ts`: per-call ceiling, hourly budget, cost recording/queries (10 tests)
- `config.test.ts`: config creation, defaults merging, path resolution (12 tests)
- `orchestrator-deadlock.test.ts`: dead worker marking, max recovery, recovery count (3 tests)
- `balance-report.test.ts`: reported vs funded balance, staleness, fallback (5 tests)
- `meow-skill.test.ts`: SKILL.md parsing, frontmatter, requirements (4 tests)
- `pnpm typecheck` passes with zero errors

🤖 Generated with Claude Code
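As a reading aid for the balance-reporting change in the Summary, the preference rule can be sketched as below. This is an illustration under assumptions, not the body of `SimpleFundingProtocol.getBalance()`: the `ReportedBalance` shape, the `resolveBalance` helper, and the choice of cents as the unit for the reported balance are hypothetical. Only the 10-minute freshness window and the fallback to `funded_amount_cents` come from the summary.

```typescript
// Hypothetical record for a balance taken from a child's status report.
interface ReportedBalance {
  creditBalanceCents: number; // assumed unit; the PR only names `credit_balance`
  reportedAtMs: number;       // epoch milliseconds when the report was stored
}

// The summary says reports younger than 10 minutes are preferred.
const MAX_REPORT_AGE_MS = 10 * 60 * 1000;

// Prefer a fresh self-reported balance. Otherwise fall back to the funded
// amount, which is only an upper bound on what the child still holds.
function resolveBalance(
  fundedAmountCents: number,
  report: ReportedBalance | undefined,
  nowMs: number,
): number {
  if (report !== undefined && nowMs - report.reportedAtMs < MAX_REPORT_AGE_MS) {
    return report.creditBalanceCents;
  }
  return fundedAmountCents;
}
```

Passing `nowMs` explicitly keeps the staleness cases (fresh, stale, missing report) easy to exercise in a unit test without mocking the clock.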