Return user error when cluster identity role assignments are missing#4635
Return user error when cluster identity role assignments are missing#4635
Conversation
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: slawande2 The full list of commands accepted by this bot can be found here. DetailsNeeds approval from an approver in each of these files:Approvers can indicate their approval by writing |
rajdeepc2792
left a comment
There was a problem hiding this comment.
Provided some insights, the functionality change looks good and must pass in the local testing.
| }, | ||
| { | ||
| name: "Fail - An action is denied for a platform identity", | ||
| platformIdentities: platformWorkloadIdentities, |
There was a problem hiding this comment.
How this change is related to the functional change in the PR? I believe the test would pass without this change?
There was a problem hiding this comment.
This was just a test hygiene improvement, not functionally required, but it would remove ambiguity about which identity triggers the error, making the assertion on the new detailed error message reliable and clear.
Since the role name will be removed from the error message per your other comment in platformworkloadidentityprofile.go, the wantErr will no longer depend on which map entry is iterated first, so reducing to a single-entry map is unnecessary. I'll revert it.
| if err == wait.ErrWaitTimeout { | ||
| return api.NewCloudError( | ||
| http.StatusBadRequest, | ||
| api.CloudErrorCodeInvalidWorkloadIdentityPermissions, |
There was a problem hiding this comment.
Instead of suggesting invalidWorkloadIdentityPermissions, can we define a new Code with ClusterMSI or ClusterManagedIdentity Permissions?
| api.CloudErrorCodeInvalidWorkloadIdentityPermissions, | ||
| "", | ||
| fmt.Sprintf( | ||
| "The cluster user assigned identity does not have required permissions on platform workload identity '%s' used for role '%s'.", |
There was a problem hiding this comment.
I believe the text "used for role '%s'" doesn't provide any useful additional information for the user. The workload identity information is sufficient.
Which issue this PR addresses:
Fixes ARO-24371
What this PR does / why we need it:
Currently, when the cluster MSI is missing role assignments over platform workload identities, ValidateClusterUserAssignedIdentity() polls CheckAccess v2 for up to 6 minutes and returns a raw wait.ErrWaitTimeout, which surfaces as a 500 internal server error.
We already handle this correctly for identity → VNet role assignments by converting the timeout to a 400 CloudError. This PR applies the same pattern to the cluster identity → platform workload identity case, returning a 400 InvalidWorkloadIdentityPermissions error that tells the customer which identity and role are missing permissions.
Test plan for issue:
Updated two existing unit tests in TestValidateClusterUserAssignedIdentity to verify the new CloudError is returned instead of the timeout error.
Is there any documentation that needs to be updated for this PR?
no