fix: run scheduled jobs in background tasks to prevent bus blocking #177

Open
sakhnenkoff wants to merge 1 commit into RichardAtCT:main from sakhnenkoff:patch/concurrent-dispatch
@sakhnenkoff
Summary

Fixes #174.

AgentHandler.handle_scheduled currently awaits claude.run_command() directly, which blocks the event bus for the entire duration of a Claude execution (often 5-10+ minutes). This causes subsequent scheduled jobs to be delayed or missed entirely.

Changes:

  • handle_scheduled now dispatches work via asyncio.create_task() and returns immediately
  • New _run_scheduled method runs the actual Claude work under an asyncio.Semaphore(2) to cap concurrent executions
  • New _task_done callback cleans up finished tasks and logs errors (prevents "Task exception was never retrieved" warnings)
  • handle_webhook is unchanged: webhook handling is fast, so awaiting it directly is acceptable
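
For readers who have not opened the diff, the dispatch pattern described in the list above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the class name `AgentHandlerSketch`, the injected `run_command` callable (standing in for `claude.run_command()`), and the `_tasks` set are assumptions made so the example is self-contained.

```python
import asyncio
import logging

logger = logging.getLogger(__name__)


class AgentHandlerSketch:
    """Hypothetical stand-in for AgentHandler; not the PR's actual code."""

    def __init__(self, run_command, max_concurrent=2):
        # run_command is an assumed async callable standing in for claude.run_command().
        self._run_command = run_command
        # Caps how many scheduled jobs may run at the same time.
        self._semaphore = asyncio.Semaphore(max_concurrent)
        # Strong references keep pending tasks from being garbage-collected.
        self._tasks = set()

    async def handle_scheduled(self, event):
        # Schedule the work and return immediately so the bus is not blocked.
        task = asyncio.create_task(self._run_scheduled(event))
        self._tasks.add(task)
        task.add_done_callback(self._task_done)

    async def _run_scheduled(self, event):
        # Wait for a free slot, then do the long-running work.
        async with self._semaphore:
            await self._run_command(event)

    def _task_done(self, task):
        # Drop the finished task and retrieve its exception, if any, so
        # asyncio does not warn "Task exception was never retrieved".
        self._tasks.discard(task)
        if task.cancelled():
            return
        exc = task.exception()
        if exc is not None:
            logger.error("Scheduled job failed: %s", exc)
```

Keeping a reference to each task until its done callback fires matters because the event loop holds only weak references to tasks, so an unreferenced background task can otherwise be collected before it finishes.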

Why not modify EventBus instead?

Changing the bus dispatcher to run events concurrently would break ordering guarantees, risk fire-and-forget exception leaks, and require verifying all handlers for concurrent safety. Scoping the fix to AgentHandler.handle_scheduled is minimal and safe.

Tests

4 new tests in tests/unit/test_events/test_concurrent_scheduled.py:

  • test_handle_scheduled_returns_immediately — verifies dispatch completes within 1s even when Claude would take 10s
  • test_scheduled_semaphore_limits_concurrency — verifies semaphore caps at 2
  • test_scheduled_task_errors_are_logged_not_raised — confirms background errors are logged, not propagated
  • test_webhook_handler_still_blocks — verifies webhook handling still awaits directly

All 538 existing tests continue to pass.
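
As a rough idea of what the "returns immediately" check amounts to, here is a small self-contained sketch. It does not reproduce the test file in this PR: `dispatch_in_background` is a hypothetical stand-in for `handle_scheduled`, and the sketch uses plain `asyncio.run` rather than whatever async test plugin the repository uses.

```python
import asyncio
import time


async def dispatch_in_background(slow_job, tasks):
    """Hypothetical stand-in for handle_scheduled: schedule the job and return."""
    task = asyncio.create_task(slow_job())
    tasks.add(task)
    task.add_done_callback(tasks.discard)


def measure_dispatch_time():
    """Return how long the dispatch call itself took, in seconds."""

    async def scenario():
        async def slow_job():
            await asyncio.sleep(10)  # simulates a long Claude run

        tasks = set()
        start = time.monotonic()
        await dispatch_in_background(slow_job, tasks)
        elapsed = time.monotonic() - start

        # Cancel the still-sleeping job so the check does not wait 10 seconds.
        pending = list(tasks)
        for task in pending:
            task.cancel()
        await asyncio.gather(*pending, return_exceptions=True)
        return elapsed

    return asyncio.run(scenario())
```

A test built on this would assert that `measure_dispatch_time()` is below one second, mirroring the 1s bound described for test_handle_scheduled_returns_immediately.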

AgentHandler.handle_scheduled now dispatches work via asyncio.create_task
and returns immediately, preventing long Claude executions from blocking
the event bus and causing missed heartbeats. A semaphore (max 2) caps
concurrent Claude executions. handle_webhook remains unchanged.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Development

Successfully merging this pull request may close these issues.

Scheduled jobs missed when bot is busy processing a message