Fix isDependsOnPodsReady to treat Succeeded pods as ready #5439

Open
avinxshKD wants to merge 1 commit into volcano-sh:master from avinxshKD:fix/dependsOn-PodSucceeded

Conversation

@avinxshKD

Contributor

What type of PR is this?

/kind bug

What this PR does / why we need it:

Fixes a workflow-breaking bug where a task with a dependsOn dependency never starts if the dependency task completes successfully.

isDependsOnPodsReady ran the containerStatus.Ready check on completed pods as well. In Kubernetes, the containers of a pod in the PodSucceeded phase are terminated, so Ready is always false and the dependency was never counted as satisfied. This PR bypasses the container readiness loop for pods in the v1.PodSucceeded phase.
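A minimal sketch of the idea, not the actual Volcano code: the `Pod`, `PodPhase`, and `ContainerStatus` types below are simplified stand-ins for the `k8s.io/api/core/v1` types, and `podCountsAsReady` and `countReadyPods` are hypothetical helper names chosen for illustration.

```go
package main

import "fmt"

// PodPhase, ContainerStatus and Pod are simplified stand-ins for the
// k8s.io/api/core/v1 types (an assumption made to keep the sketch self-contained).
type PodPhase string

const (
	PodPending   PodPhase = "Pending"
	PodRunning   PodPhase = "Running"
	PodSucceeded PodPhase = "Succeeded"
)

type ContainerStatus struct {
	Name  string
	Ready bool
}

type Pod struct {
	Name              string
	Phase             PodPhase
	ContainerStatuses []ContainerStatus
}

// podCountsAsReady is a hypothetical helper. A Succeeded pod is accepted
// without looking at container readiness, because its containers have
// terminated and report Ready=false. A Running pod must have every
// container ready. Any other phase is not ready.
func podCountsAsReady(pod Pod) bool {
	switch pod.Phase {
	case PodSucceeded:
		return true
	case PodRunning:
		for _, cs := range pod.ContainerStatuses {
			if !cs.Ready {
				return false
			}
		}
		return true
	default:
		return false
	}
}

// countReadyPods returns how many dependency pods satisfy podCountsAsReady.
func countReadyPods(pods []Pod) int {
	n := 0
	for _, p := range pods {
		if podCountsAsReady(p) {
			n++
		}
	}
	return n
}

func main() {
	pods := []Pod{
		{Name: "done", Phase: PodSucceeded, ContainerStatuses: []ContainerStatus{{Name: "main", Ready: false}}},
		{Name: "up", Phase: PodRunning, ContainerStatuses: []ContainerStatus{{Name: "main", Ready: true}}},
		{Name: "starting", Phase: PodRunning, ContainerStatuses: []ContainerStatus{{Name: "main", Ready: false}}},
		{Name: "waiting", Phase: PodPending},
	}
	fmt.Println(countReadyPods(pods)) // prints 2
}
```

In this sketch the terminated container of the Succeeded pod still reports `Ready: false`, yet the pod is counted, which is the behavior the description above asks for.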

Which issue(s) this PR fixes:

Fixes #5424

Special notes for your reviewer:

Included table-driven tests in job_controller_actions_test.go covering PodSucceeded, PodRunning (ready), and PodRunning (not ready) dependency states.
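The three dependency states listed above could be laid out as a table along these lines. This is only a sketch under stated assumptions: `depPod`, `dependencyReady`, `preFixReady`, and `failedCases` are hypothetical names, the pod model is reduced to a phase string plus per-container readiness flags, and `preFixReady` is a guess at the old behavior based on the PR description. In the real `_test.go` file each row would normally run as a `t.Run` subtest.

```go
package main

import "fmt"

// depPod is a hypothetical, reduced model of a dependency pod.
type depPod struct {
	phase           string // "Pending", "Running" or "Succeeded"
	containersReady []bool // one entry per container
}

// allReady reports whether every container flag is true.
func allReady(flags []bool) bool {
	for _, ready := range flags {
		if !ready {
			return false
		}
	}
	return true
}

// dependencyReady sketches the fixed check: Succeeded pods skip the
// container readiness loop, Running pods need all containers ready.
func dependencyReady(p depPod) bool {
	if p.phase == "Succeeded" {
		return true
	}
	return p.phase == "Running" && allReady(p.containersReady)
}

// preFixReady sketches the assumed old behavior: the readiness loop is
// applied to Succeeded pods too, so they can never pass.
func preFixReady(p depPod) bool {
	if p.phase != "Running" && p.phase != "Succeeded" {
		return false
	}
	return allReady(p.containersReady)
}

type testCase struct {
	name string
	pod  depPod
	want bool
}

// cases mirrors the three states named in the PR notes.
func cases() []testCase {
	return []testCase{
		{name: "PodSucceeded", pod: depPod{phase: "Succeeded", containersReady: []bool{false}}, want: true},
		{name: "PodRunning (ready)", pod: depPod{phase: "Running", containersReady: []bool{true}}, want: true},
		{name: "PodRunning (not ready)", pod: depPod{phase: "Running", containersReady: []bool{false}}, want: false},
	}
}

// failedCases returns the names of the cases for which check disagrees
// with the expected value.
func failedCases(check func(depPod) bool) []string {
	var failed []string
	for _, tc := range cases() {
		if got := check(tc.pod); got != tc.want {
			failed = append(failed, tc.name)
		}
	}
	return failed
}

func main() {
	fmt.Println(failedCases(dependencyReady)) // prints []
	fmt.Println(failedCases(preFixReady))     // prints [PodSucceeded]
}
```

Running the same table against both checks shows why the PodSucceeded row matters: it is the only case that separates the old behavior from the new one.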

Does this PR introduce a user-facing change?

Fix: `dependsOn` tasks now correctly execute when their dependency tasks reach a completed (Succeeded) state instead of stalling indefinitely.

@volcano-sh-bot volcano-sh-bot added the kind/bug Categorizes issue or PR as related to a bug. label Jun 12, 2026
@volcano-sh-bot

Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign lowang-bh for approval. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@volcano-sh-bot volcano-sh-bot added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label Jun 12, 2026
@avinxshKD

Contributor Author

kept it minimal, pls take a look @hzxuzhonghu @wangyang0616

@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

This pull request updates the isDependsOnPodsReady function in the job controller to treat succeeded pods as ready by incrementing the running pod count. It also introduces a comprehensive unit test suite (TestIsDependsOnPodsReady) to verify this behavior. Feedback was provided on the unit tests to use t.Fatalf instead of t.Error for setup failures to prevent cascading errors and to include the actual error message in the logs for easier debugging.
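The `t.Fatalf` versus `t.Error` feedback can be illustrated with a small sketch. It does not reproduce the PR's test code: `reporter` is a hypothetical narrow interface that `*testing.T` happens to satisfy, `recorder` is a stand-in that imitates `Fatalf` stopping the test by panicking, and `createPod` is an invented setup step that always fails.

```go
package main

import (
	"errors"
	"fmt"
)

// reporter is the small subset of *testing.T used in this sketch.
type reporter interface {
	Errorf(format string, args ...interface{})
	Fatalf(format string, args ...interface{})
}

// recorder is a hypothetical stand-in for *testing.T so the sketch can
// run outside `go test`. It stores every reported message.
type recorder struct {
	messages []string
}

// stopTest is the panic value recorder.Fatalf uses to abort the body.
type stopTest struct{}

func (r *recorder) Errorf(format string, args ...interface{}) {
	r.messages = append(r.messages, fmt.Sprintf(format, args...))
}

// Fatalf records the message and aborts the body. The real
// testing.T.Fatalf stops the test goroutine instead of panicking.
func (r *recorder) Fatalf(format string, args ...interface{}) {
	r.messages = append(r.messages, fmt.Sprintf(format, args...))
	panic(stopTest{})
}

// run executes body with a fresh recorder and returns what it reported.
func run(body func(t reporter)) []string {
	r := &recorder{}
	func() {
		defer func() {
			if v := recover(); v != nil {
				if _, ok := v.(stopTest); !ok {
					panic(v) // unrelated panic: re-raise it
				}
			}
		}()
		body(r)
	}()
	return r.messages
}

// createPod is an invented setup step that always fails.
func createPod() error { return errors.New("pod cache unavailable") }

// withErrorf keeps going after the setup failure, so a second,
// misleading failure is reported as well.
func withErrorf(t reporter) {
	if err := createPod(); err != nil {
		t.Errorf("failed to create pod: %v", err)
	}
	t.Errorf("dependency check: got false, want true")
}

// withFatalf stops at the setup failure and includes the error text.
func withFatalf(t reporter) {
	if err := createPod(); err != nil {
		t.Fatalf("failed to create pod: %v", err)
	}
	t.Errorf("dependency check: got false, want true")
}

func main() {
	fmt.Println(len(run(withErrorf))) // prints 2
	fmt.Println(len(run(withFatalf))) // prints 1
}
```

With `Fatalf` the only reported failure is the setup error itself, including its message, which is the outcome the review feedback is after.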

Comment thread pkg/controllers/job/job_controller_actions_test.go
Signed-off-by: Avinash Kumar Deepak <avinash8655279@gmail.com>
@avinxshKD avinxshKD force-pushed the fix/dependsOn-PodSucceeded branch from aedf7bf to 08dc71d Compare June 12, 2026 09:35
@avinxshKD

Contributor Author

The E2E sequence queue tests appear to be flaking during DeferCleanup. Since this PR only touches the dependsOn state logic in the job controller, I believe the failures are unrelated to my changes.

@hzxuzhonghu

Member

LGTM
I would like @hwdef to take a look, in case there are other considerations.

@avinxshKD

Contributor Author

@hwdef please take a look when you get a chance, thanks

cc @hzxuzhonghu

Development

Successfully merging this pull request may close these issues.

dependsOn: task with PodSucceeded dependency never starts because isDependsOnPodsReady fails the containerStatus.Ready check on completed pods

3 participants