fix(capacity): copy parent ancestors to avoid slice aliasing in hiera…#5502
fix(capacity): copy parent ancestors to avoid slice aliasing in hiera…#5502WHOIM1205 wants to merge 1 commit into
Conversation
…rchical queues Signed-off-by: WHOIM1205 <rathourprateek8@gmail.com>
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: The full list of commands accepted by this bot can be found here. DetailsNeeds approval from an approver in each of these files:Approvers can indicate their approval by writing |
|
Hi @hajnalmt , this PR includes a fix for the hierarchical queue ancestor slice-aliasing issue along with a regression test. All tests are passing. I'd appreciate a review when you have time. Thanks! |
There was a problem hiding this comment.
Code Review
This pull request addresses a slice-aliasing bug in the capacity scheduler plugin where appending to a parent's ancestors slice could corrupt the ancestor chain in deep hierarchies due to shared backing arrays. The fix introduces manual slice allocation and copying to guarantee fresh backing arrays. The reviewer suggests using idiomatic 3-index slice expressions (slice[max:len:cap]) instead of verbose manual allocations and copies to simplify the code while still preventing slice-aliasing issues.
Important
The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.
| parentAncestors := cp.queueOpts[parentInfo.UID].ancestors | ||
| ancestors := make([]api.QueueID, len(parentAncestors)+1) | ||
| copy(ancestors, parentAncestors) | ||
| ancestors[len(ancestors)-1] = parentInfo.UID | ||
| cp.queueOpts[queue.UID].ancestors = ancestors |
There was a problem hiding this comment.
Using a 3-index slice expression (parentAncestors[:len(parentAncestors):len(parentAncestors)]) is a highly idiomatic and concise way in Go to append to a slice while guaranteeing that the original slice's backing array is never mutated. This completely avoids the need for manual slice allocation, copying, and indexing.
| parentAncestors := cp.queueOpts[parentInfo.UID].ancestors | |
| ancestors := make([]api.QueueID, len(parentAncestors)+1) | |
| copy(ancestors, parentAncestors) | |
| ancestors[len(ancestors)-1] = parentInfo.UID | |
| cp.queueOpts[queue.UID].ancestors = ancestors | |
| parentAncestors := cp.queueOpts[parentInfo.UID].ancestors | |
| cp.queueOpts[queue.UID].ancestors = append(parentAncestors[:len(parentAncestors):len(parentAncestors)], parentInfo.UID) |
| ancestors := cp.queueOpts[queue.UID].ancestors | ||
| list := make([]api.QueueID, 0, len(ancestors)+1) | ||
| list = append(list, ancestors...) | ||
| list = append(list, queue.UID) |
There was a problem hiding this comment.
Using a 3-index slice expression (ancestors[:len(ancestors):len(ancestors)]) is a highly idiomatic and concise way in Go to append to a slice while guaranteeing that the original slice's backing array is never mutated. This simplifies the verbose make and double append boilerplate into a single, clean operation.
| ancestors := cp.queueOpts[queue.UID].ancestors | |
| list := make([]api.QueueID, 0, len(ancestors)+1) | |
| list = append(list, ancestors...) | |
| list = append(list, queue.UID) | |
| ancestors := cp.queueOpts[queue.UID].ancestors | |
| list := append(ancestors[:len(ancestors):len(ancestors)], queue.UID) |
| ancestors := cp.queueOpts[queue.UID].ancestors | ||
| list := make([]api.QueueID, 0, len(ancestors)+1) | ||
| list = append(list, ancestors...) | ||
| list = append(list, queue.UID) |
There was a problem hiding this comment.
Using a 3-index slice expression (ancestors[:len(ancestors):len(ancestors)]) is a highly idiomatic and concise way in Go to append to a slice while guaranteeing that the original slice's backing array is never mutated. This simplifies the verbose make and double append boilerplate into a single, clean operation.
| ancestors := cp.queueOpts[queue.UID].ancestors | |
| list := make([]api.QueueID, 0, len(ancestors)+1) | |
| list = append(list, ancestors...) | |
| list = append(list, queue.UID) | |
| ancestors := cp.queueOpts[queue.UID].ancestors | |
| list := append(ancestors[:len(ancestors):len(ancestors)], queue.UID) |
There was a problem hiding this comment.
Pull request overview
Fixes a Go slice-aliasing bug in the capacity plugin’s hierarchical queue ancestor construction, preventing corruption of ancestor chains in deep queue trees that could lead to incorrect hierarchical quota enforcement and resource accounting.
Changes:
- Rebuild ancestor chains in
updateAncestors()using a fresh slice allocation + copy to avoid backing-array aliasing between sibling subtrees. - Avoid appending onto stored
ancestorsslices in hierarchical allocatable/enqueueable checks by building a separate list before traversal. - Add a regression test that constructs a deep, branching queue hierarchy and validates exact ancestor paths for each queue.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| pkg/scheduler/plugins/capacity/capacity.go | Eliminates ancestor-slice backing-array aliasing and hardens hierarchical checks against accidental mutation. |
| pkg/scheduler/plugins/capacity/capacity_test.go | Adds regression coverage for deep hierarchical queue ancestry correctness across sibling branches. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| ancestors := cp.queueOpts[queue.UID].ancestors | ||
| list := make([]api.QueueID, 0, len(ancestors)+1) | ||
| list = append(list, ancestors...) | ||
| list = append(list, queue.UID) |
| func (cp *capacityPlugin) checkJobEnqueueableHierarchically(ssn *framework.Session, queue *api.QueueInfo, job *api.JobInfo) bool { | ||
| // If hierarchical queue is not enabled, list will only contain the queue itself. | ||
| list := append(cp.queueOpts[queue.UID].ancestors, queue.UID) | ||
| // Do not append onto the stored ancestors slice: that could mutate its shared | ||
| // backing array. Copy into a fresh slice instead. | ||
| ancestors := cp.queueOpts[queue.UID].ancestors |
| cp.queueOpts[uid] = cp.newQueueAttr(ssn.Queues[uid]) | ||
| if err := cp.updateAncestors(ssn.Queues[uid], ssn, map[api.QueueID]struct{}{}); err != nil { | ||
| t.Fatalf("updateAncestors(%s) returned error: %v", s.name, err) | ||
| } | ||
| } |
Summary
Fix a slice-aliasing issue in hierarchical queue ancestor construction that could corrupt ancestor chains in deep queue hierarchies, leading to incorrect quota enforcement and resource accounting.
Problem
updateAncestors()built a child queue's ancestor chain using:When the parent ancestor slice had spare capacity, Go reused the same backing array. As a result, sibling subtrees could unintentionally share memory, allowing later updates to overwrite ancestor entries belonging to other queues.
This could cause:
Root Cause
The ancestor list stored in
queueOptswas reused viaappend(), which allowed sibling queues to alias the same backing array.Since hierarchical capacity and enqueue checks rely on ancestor chains, corrupted ancestry could propagate into scheduling decisions.
Fix
Ancestor Construction
append()usage inupdateAncestors()with explicit slice allocation and copy.Defensive Hardening
Avoid mutating stored ancestor slices in:
checkQueueAllocatableHierarchically()checkJobEnqueueableHierarchically()Both paths now construct temporary lists using fresh slices before appending queue IDs.
Testing
Added regression test:
Test_updateAncestors_noSliceAliasingThe test:
The test reproduces the bug on the old implementation and passes with the fix.
Verification
go vetpasses successfullyImpact
This change guarantees that ancestor chains remain isolated and immutable across sibling subtrees, restoring correct hierarchical quota enforcement, enqueue validation, and resource accounting for deep queue hierarchies while preserving existing behavior for shallow hierarchies.