fix: sort aggregation values to avoid deadlocks in process_pending#1851
Merged
gustavobtflores merged 2 commits intokernelci:mainfrom Apr 13, 2026
Merged
Conversation
There was a problem hiding this comment.
Pull request overview
Fixes PostgreSQL deadlocks on tree_listing caused by concurrent executemany() updates acquiring row locks in inconsistent orders, by sorting batch value lists before executing SQL.
Changes:
- Sorts
_process_tree_listing()UPDATE batches beforecursor.executemany()to enforce a deterministic lock acquisition order. - Sorts
update_tree_listing()UPSERT batches beforecursor.executemany()to enforce a deterministic lock acquisition order.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| backend/kernelCI_app/management/commands/process_pending_aggregations.py | Sorts tree_listing UPDATE batch values prior to executemany() to reduce deadlock risk. |
| backend/kernelCI_app/management/commands/helpers/aggregation_helpers.py | Sorts tree_listing UPSERT batch values prior to executemany() to reduce deadlock risk. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
…gregations and update_tree_listing
6fce320 to
4aa88d2
Compare
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 3 out of 3 changed files in this pull request and generated no new comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
MarceloRobert
approved these changes
Apr 13, 2026
| git_commit_hash: str | None | ||
|
|
||
|
|
||
| def tree_listing_sort_key(v: Any) -> tuple: |
Collaborator
There was a problem hiding this comment.
nit: avoid using Any
4aa88d2 to
fc3657b
Compare
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Description
Fixes a PostgreSQL deadlock in the
tree_listingtable caused by concurrent updates from the ingester and aggregation processor. When two transactions locked overlapping rows in different orders, PostgreSQL detected a deadlock and killed one transaction. This fix ensures all transactions acquire row locks in a deterministic order by sorting values by their lock-key columns before executingexecutemany().Why the deadlock happens
Two concurrent processes update the
tree_listingtable inside separate transactions:The ingester runs
aggregate_checkouts_and_pendings()insidetransaction.atomic(), which callsupdate_tree_listing(), anINSERT ... ON CONFLICT DO UPDATEontree_listing.The aggregation processor runs
_process_tree_listing()insidetransaction.atomic()— a plainUPDATE tree_listingviaexecutemany().Both processes use
cursor.executemany()which sends one statement per row. PostgreSQL acquires row-level locks incrementally as each statement executes. When two transactions lock overlapping rows in different orders, PostgreSQL detects a deadlock and kills one of them.The iteration order came from
dict.values()(insertion order in Python 3.7+), which depends on the arrival order of tests/builds, not on any stable sort key. So two concurrent batches that touched overlappingtree_listingrows could easily lock them in opposite order, causing deadlocks.By sorting both writers by the same lock-key columns in the same deterministic order, all transactions now acquire row locks consistently, making deadlocks theoretically impossible.
Changes
checkout_valuesinupdate_tree_listing()by the unique constraint columns(origin, tree_name, git_repository_url, git_repository_branch, git_commit_hash)before the INSERT ... ON CONFLICT DO UPDATEvaluesin_process_tree_listing()(process_pending_aggregations.py) by the UPDATE WHERE clause columns(origin, tree_name, git_repository_branch, git_repository_url, git_commit_hash)before executing the UPDATEHow to test
docker compose exec test_backend poetry run pytest