fix(workflows): make task failures non-blocking #555

Merged — kelvinkipruto merged 6 commits into main from fix/non-blocking-workflow-tasks, Mar 6, 2026

Conversation

@kelvinkipruto
Contributor

Description

Wrap each workflow task in a shared runTask helper so that a single task failure no longer aborts the workflow.
Previously, an unhandled task error would mark the workflow job as failed in payload-jobs, blocking all subsequent scheduled runs until the table was cleared manually. Now each task catches its own error, logs it to Sentry, and the workflow always completes — allowing the next cron cycle to start cleanly.
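A minimal sketch of what such a helper could look like (illustrative only — `reportError` stands in for the real Sentry reporting, and the merged helper lives in `src/workflows/utils.ts`):

```typescript
// Hypothetical stand-in for the real Sentry reporting call.
function reportError(taskName: string, error: unknown): void {
  console.error(`[workflow] task "${taskName}" failed:`, error);
}

// Run a single workflow task, swallowing its error so the workflow as a
// whole always completes. Returns true on success, false on failure.
async function runTask(
  taskName: string,
  task: () => Promise<void>,
): Promise<boolean> {
  try {
    await task();
    return true;
  } catch (error) {
    reportError(taskName, error);
    return false;
  }
}
```

Because the helper never rethrows, the workflow handler finishes normally and the payload-jobs row is marked complete even when individual tasks fail.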

Fixes # (issue)

Type of change

  • Bug fix (non-breaking change which fixes an issue)

Screenshots

Checklist:

  • My code follows the style guidelines of this project
  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation

@vercel

vercel bot commented Mar 6, 2026

The latest updates on your projects.

Project: PromiseTracker · Deployment: Ready · Status: Ready · Actions: Preview, Comment · Updated (UTC): Mar 6, 2026 2:21pm


@kelvinkipruto
Contributor Author

@claude Review

@kelvinkipruto
Contributor Author

@codex review

@claude

claude bot commented Mar 6, 2026

Claude finished @kelvinkipruto's task in 1m 8s


Code Review

  • Gather context
  • Read changed files
  • Analyze diff
  • Post review feedback

The change solves the immediate problem (blocked cron cycles due to unhandled task errors) with a clean, minimal helper. Overall the approach is solid, but there are a few things worth discussing.


Issues

1. Silent failure propagation through dependent tasks

src/workflows/airtableWorkflow.ts — The 7 tasks appear to form a pipeline with real data dependencies:

createTenantFromAirtable → createPoliticalEntity → fetchAirtableDocuments
  → downloadDocuments → extractDocuments → extractPromises → uploadToMeedan

With runTask swallowing all errors, if fetchAirtableDocuments fails (e.g. Airtable is down), downloadDocuments, extractDocuments, extractPromises, and uploadToMeedan will still run against stale or missing data. Each will also fail and be silently swallowed. The workflow will report as "complete" in payload-jobs, but no data was actually processed. This is arguably harder to debug than the original blocking failure.

The same concern applies to src/workflows/meedanWorkflow.ts: if fetchPromiseStatuses fails, updatePromiseStatus runs with no fresh data.

One option is to have runTask return a boolean and abort the remaining tasks when a critical upstream one fails.
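Such a bail-out could look like the following sketch (hypothetical shape — `runTask` is stubbed minimally and the real task bodies are omitted):

```typescript
// Minimal runTask stub: true on success, false on failure.
async function runTask(
  name: string,
  task: () => Promise<void>,
): Promise<boolean> {
  try {
    await task();
    return true;
  } catch (error) {
    console.error(`task "${name}" failed:`, error); // real code: Sentry
    return false;
  }
}

// Hypothetical pipeline: skip downstream tasks when a critical upstream
// task fails, instead of running them against stale or missing data.
async function runAirtablePipeline(
  fetchDocs: () => Promise<void>,
  downloadDocs: () => Promise<void>,
): Promise<boolean> {
  const fetched = await runTask("fetchAirtableDocuments", fetchDocs);
  if (!fetched) return false; // bail out: downstream depends on this data
  return runTask("downloadDocuments", downloadDocs);
}
```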

2. withWorkflowErrorCapture is now effectively dead code for task errors

src/workflows/utils.ts:164–193 — withWorkflowErrorCapture wraps the handler to catch and report errors. Since runTask now swallows all task errors before they can bubble up, the outer catch in withWorkflowErrorCapture will never fire for task failures. It now only guards against errors thrown by the handler function itself (e.g. a bug in the handler setup code), which is a very narrow case. This is worth documenting or acknowledging to avoid confusion.

3. Duplicate Sentry reporting

src/workflows/utils.ts:13–19 — Both Sentry.logger.error and Sentry.captureException are called for every failure. Depending on whether Sentry.logger is configured to auto-capture as exceptions, this could create two Sentry issues per failure. Consider whether both are needed, or if captureException alone (with a breadcrumb/log attachment) is sufficient.

Additionally, Sentry.logger.error includes only error.message (line 16), discarding the stack trace. The full exception detail is preserved by captureException, but losing the stack in the log context can make log-based debugging harder.
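A sketch of the consolidated reporting suggested here (Sentry is stubbed with a recording stand-in so the example is self-contained; the real code uses the Sentry SDK):

```typescript
// Stand-in for the Sentry SDK: records what was captured.
const captured: unknown[] = [];
const Sentry = {
  captureException(error: unknown): void {
    captured.push(error); // the real SDK preserves the full stack trace
  },
};

async function runTask(
  name: string,
  task: () => Promise<void>,
): Promise<boolean> {
  try {
    await task();
    return true;
  } catch (error) {
    // A single captureException call yields one Sentry issue per failure
    // with the stack trace intact, instead of pairing it with a
    // logger.error that only carried error.message.
    Sentry.captureException(error);
    return false;
  }
}
```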


Minor Suggestions

runTask return value (src/workflows/utils.ts:5) — The function currently returns Promise<void> (implicitly). If tasks ever need to return values, or if the short-circuit pattern above is adopted, making the return type explicit as Promise<boolean> would be a cheap improvement in expressiveness.

workflow parameter is redundant — The workflow name is available in the Sentry span/transaction context set up by withWorkflowContext. Passing it explicitly into every runTask call at the call site is slightly noisy. This is minor — just worth considering if the helper is extended.


Summary

The fix correctly addresses the cron-blocking regression with minimal code. The main risk is that the pipeline now silently completes even when all tasks fail mid-way — operators need to rely on Sentry alerts to detect problems. If the tasks are truly independent (idempotent and not pipelined), this is fine as-is. If they form a data dependency chain, consider adding a bail-out mechanism for upstream failures.


@chatgpt-codex-connector chatgpt-codex-connector bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: c9aa6e2681

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

kelvinkipruto and others added 3 commits March 6, 2026 17:01
Task errors are now swallowed by runTask before they can bubble up, so
withWorkflowErrorCapture no longer fires for task failures. Add a
comment acknowledging this so it is not mistaken for dead code — it
remains as a safety net for errors thrown in the handler setup code
itself, outside any runTask call.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Replaced Sentry.logger.error + captureException with captureException
alone. Tasks using withTaskTracing already capture exceptions before
re-throwing, so the previous two-call pattern risked triple-reporting
the same error. captureException also preserves the full stack trace,
which Sentry.logger.error discarded by only passing error.message.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Drop the workflow parameter — the Sentry span context from
  withWorkflowContext already carries the workflow name, so passing it
  explicitly at every call site was redundant noise.
- Add explicit Promise<boolean> return type: true on success, false on
  failure, making the helper's intent clearer if callers ever need to
  act on the result.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@kelvinkipruto
Copy link
Contributor Author

@claude Review.
On

  1. Silent failure propagation through dependent tasks

The intended approach is for each task to run independently; a failure in one should not affect the other tasks.

@claude
Copy link

claude bot commented Mar 6, 2026

Claude finished @kelvinkipruto's task in 1m 0s


Code Review

  • Gather context
  • Read changed files
  • Analyze changes
  • Post review feedback

Thanks for the clarification — understood that tasks are designed to be independent. That resolves concern #1. Here's the updated review focusing on what remains.


The approach looks solid

The runTask helper is clean and minimal. Wrapping each task individually, catching errors, reporting to Sentry, and letting the workflow always complete is a well-reasoned design for independent tasks. The comment added at src/workflows/utils.ts:162–165 correctly documents the narrowed scope of withWorkflowErrorCapture — that was a helpful addition.


Remaining points

1. runTask return value goes unused

runTask already returns Promise<boolean> (true/false), but neither airtableWorkflow.ts nor meedanWorkflow.ts uses the value. This is fine since tasks are independent, but if you ever want to add task-level observability (e.g., logging a summary of which tasks succeeded/failed per run), the return values are there to use. No change needed — just noting it's available.
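If such a summary were ever wanted, it could be as simple as the following sketch (hypothetical names, minimal `runTask` stub):

```typescript
// Minimal runTask stub: true on success, false on failure.
async function runTask(
  name: string,
  task: () => Promise<void>,
): Promise<boolean> {
  try { await task(); return true; } catch { return false; }
}

// Collect per-task outcomes so the workflow can log a summary at the end.
async function runWithSummary(
  tasks: Record<string, () => Promise<void>>,
): Promise<Record<string, boolean>> {
  const results: Record<string, boolean> = {};
  for (const [name, task] of Object.entries(tasks)) {
    results[name] = await runTask(name, task);
  }
  return results;
}
```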

2. Potential duplicate Sentry reporting in withWorkflowErrorCapture

src/workflows/utils.ts:181–191 — For handler-level errors (i.e., bugs outside any runTask call), withWorkflowErrorCapture calls both Sentry.logger.error and Sentry.captureException. Depending on Sentry SDK configuration, logger.error may also auto-submit as an exception, potentially creating two Sentry issues for one error. Also note that line 183 only logs error.message — the full stack is only preserved by captureException.

Since handler-level errors are now rare (only setup bugs would reach this path), the practical impact is low. But if you want to clean it up, keeping only the captureException call would suffice.

3. Sequential execution of independent tasks (minor)

All tasks in both workflows are awaited sequentially (await runTask(...) one at a time). Since the tasks are independent, Promise.allSettled would run them concurrently and reduce total wall-clock time per workflow run. This is a minor performance suggestion — sequential is simpler and avoids any risk of resource contention, so it's reasonable to keep as-is.
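The concurrent variant mentioned here could look like the sketch below (minimal `runTask` stub; since runTask never rejects, Promise.all would also work, but allSettled is shown as suggested):

```typescript
// Minimal runTask stub: true on success, false on failure.
async function runTask(
  name: string,
  task: () => Promise<void>,
): Promise<boolean> {
  try { await task(); return true; } catch { return false; }
}

// Run independent tasks concurrently instead of awaiting them one by one.
// runTask never rejects, so every allSettled result is "fulfilled"; the
// fallback to false is purely defensive.
async function runConcurrently(
  tasks: Array<[string, () => Promise<void>]>,
): Promise<boolean[]> {
  const settled = await Promise.allSettled(
    tasks.map(([name, task]) => runTask(name, task)),
  );
  return settled.map((r) => (r.status === "fulfilled" ? r.value : false));
}
```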


Summary

The fix is correct and well-scoped. The only actionable item worth addressing is the potential duplicate Sentry reporting in withWorkflowErrorCapture (point 2), but even that is low-priority given how rarely that path now fires. Otherwise this is good to merge.



Remove duplicate Sentry captureMessage call in task logger to avoid double reporting.
Remove Sentry.logger.error call in workflow error capture as Sentry.captureException already handles error reporting.
Change cleanup failed jobs schedule from daily to hourly to improve system maintenance.
@kelvinkipruto kelvinkipruto merged commit e5039b0 into main Mar 6, 2026
5 of 6 checks passed
@kelvinkipruto kelvinkipruto deleted the fix/non-blocking-workflow-tasks branch March 6, 2026 14:19