Skip to content

fix: sort aggregation values to avoid deadlocks in process_pending#1851

Merged
gustavobtflores merged 2 commits intokernelci:mainfrom
gustavobtflores:fix/tree-listing-lock
Apr 13, 2026
Merged

fix: sort aggregation values to avoid deadlocks in process_pending#1851
gustavobtflores merged 2 commits intokernelci:mainfrom
gustavobtflores:fix/tree-listing-lock

Conversation

@gustavobtflores
Copy link
Copy Markdown
Contributor

Description

Fixes a PostgreSQL deadlock in the tree_listing table caused by concurrent updates from the ingester and aggregation processor. When two transactions locked overlapping rows in different orders, PostgreSQL detected a deadlock and killed one transaction. This fix ensures all transactions acquire row locks in a deterministic order by sorting values by their lock-key columns before executing executemany().

Why the deadlock happens

Two concurrent processes update the tree_listing table inside separate transactions:

  1. The ingester runs aggregate_checkouts_and_pendings() inside transaction.atomic(), which calls update_tree_listing(), an INSERT ... ON CONFLICT DO UPDATE on tree_listing.

  2. The aggregation processor runs _process_tree_listing() inside transaction.atomic() — a plain UPDATE tree_listing via executemany().

Both processes use cursor.executemany() which sends one statement per row. PostgreSQL acquires row-level locks incrementally as each statement executes. When two transactions lock overlapping rows in different orders, PostgreSQL detects a deadlock and kills one of them.

The iteration order came from dict.values() (insertion order in Python 3.7+), which depends on the arrival order of tests/builds, not on any stable sort key. So two concurrent batches that touched overlapping tree_listing rows could easily lock them in opposite order, causing deadlocks.

By sorting both writers by the same lock-key columns in the same deterministic order, all transactions now acquire row locks consistently, making deadlocks theoretically impossible.

Changes

  • Sort checkout_values in update_tree_listing() by the unique constraint columns (origin, tree_name, git_repository_url, git_repository_branch, git_commit_hash) before the INSERT ... ON CONFLICT DO UPDATE
  • Sort values in _process_tree_listing() (process_pending_aggregations.py) by the UPDATE WHERE clause columns (origin, tree_name, git_repository_branch, git_repository_url, git_commit_hash) before executing the UPDATE

How to test

  1. Run the backend test suite: docker compose exec test_backend poetry run pytest
  2. Verify no regressions in tree listing aggregation tests

@gustavobtflores gustavobtflores self-assigned this Apr 13, 2026
@gustavobtflores gustavobtflores added the bug Something isn't working label Apr 13, 2026
@gustavobtflores gustavobtflores requested a review from Copilot April 13, 2026 14:18
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Fixes PostgreSQL deadlocks on tree_listing caused by concurrent executemany() updates acquiring row locks in inconsistent orders, by sorting batch value lists before executing SQL.

Changes:

  • Sorts _process_tree_listing() UPDATE batches before cursor.executemany() to enforce a deterministic lock acquisition order.
  • Sorts update_tree_listing() UPSERT batches before cursor.executemany() to enforce a deterministic lock acquisition order.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 4 comments.

File Description
backend/kernelCI_app/management/commands/process_pending_aggregations.py Sorts tree_listing UPDATE batch values prior to executemany() to reduce deadlock risk.
backend/kernelCI_app/management/commands/helpers/aggregation_helpers.py Sorts tree_listing UPSERT batch values prior to executemany() to reduce deadlock risk.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread backend/kernelCI_app/management/commands/process_pending_aggregations.py Outdated
Comment thread backend/kernelCI_app/management/commands/process_pending_aggregations.py Outdated
Comment thread backend/kernelCI_app/management/commands/helpers/aggregation_helpers.py Outdated
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 3 out of 3 changed files in this pull request and generated no new comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copy link
Copy Markdown
Collaborator

@MarceloRobert MarceloRobert left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

git_commit_hash: str | None


def tree_listing_sort_key(v: Any) -> tuple:
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: avoid using Any

Copy link
Copy Markdown
Contributor

@alanpeixinho alanpeixinho left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@gustavobtflores gustavobtflores added this pull request to the merge queue Apr 13, 2026
Merged via the queue into kernelci:main with commit d1cbe99 Apr 13, 2026
7 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bug Something isn't working

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants