Skip to content

HYPERFLEET-657 - fix: enforce mandatory conditions in adapter status#60

Open
yasun1 wants to merge 1 commit intoopenshift-hyperfleet:mainfrom
yasun1:HYPERFLEET-657
Open

HYPERFLEET-657 - fix: enforce mandatory conditions in adapter status#60
yasun1 wants to merge 1 commit intoopenshift-hyperfleet:mainfrom
yasun1:HYPERFLEET-657

Conversation

@yasun1
Copy link
Contributor

@yasun1 yasun1 commented Feb 28, 2026

Reject adapter status updates missing mandatory conditions (Available, Applied, Health) or containing Unknown status. Returns HTTP 204 and preserves existing status to prevent loss of critical cluster state.

Changes:

  • Add condition type constants in pkg/api/status_types.go
  • Validate mandatory conditions in cluster/nodepool services
  • Replace hardcoded condition strings with constants
  • Add 4 new tests, update 8 existing tests for validation
  • Update integration tests to use constants

All tests passing: 416 unit, 51 integration, 0 lint issues

Summary by CodeRabbit

  • New Features

    • Added standard public condition identifiers: Available, Applied, Health, Ready.
  • Bug Fixes

    • Enforced mandatory-condition validation: updates missing required conditions are discarded.
    • First report may include Unknown Available (accepted but skips aggregation); subsequent Unknown Available reports are rejected (no-op).
    • Adapter updates use complete-replacement semantics; aggregation runs only when an actionable Available condition is present; aggregation failures are logged.
  • Tests

    • Expanded and renamed tests to cover mandatory-condition validation, Unknown handling, preservation of existing statuses, and aggregation behavior.

@openshift-ci openshift-ci bot requested review from AlexVulaj and tirthct February 28, 2026 08:31
@coderabbitai
Copy link

coderabbitai bot commented Feb 28, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review

Walkthrough

This PR adds new public condition type constants (ConditionTypeAvailable, ConditionTypeApplied, ConditionTypeHealth, ConditionTypeReady) in pkg/api/status_types.go, introduces ValidateMandatoryConditions and a mandatoryConditionTypes list in pkg/services/status_aggregation.go, and updates cluster and node-pool adapter status processing to validate mandatory conditions (Available, Applied, Health). Adapter status upserts now use full-replacement semantics; updates missing mandatory conditions or containing non-actionable (Unknown) Available reports are discarded (first Unknown may be accepted but aggregation skipped; subsequent Unknown Available reports are rejected). Tests and integration tests were updated to use the new API constants and to cover the revised validation and Unknown-handling behaviors.

Sequence Diagram(s)

sequenceDiagram
    participant Adapter
    participant Service as Cluster/NodePool Service
    participant Storage as Status Storage
    participant Aggregation as Status Aggregation

    Adapter->>Service: POST adapter status (conditions)

    rect rgba(200,100,100,0.5)
        Service->>Service: Build condition map
        Service->>Service: ValidateMandatoryConditions(Available, Applied, Health)
        alt Missing mandatory OR disallowed Unknown (subsequent Unknown Available)
            Service->>Service: Log rejection reason
            Service-->>Adapter: 204 No Content (discarded)
        else First report with Available=Unknown
            Service->>Service: Accept and store adapter status (full replace)
            Service-->>Adapter: 201 Created (aggregation skipped)
        end
    end

    rect rgba(100,200,100,0.5)
        alt Valid (all mandatory present and actionable)
            Service->>Storage: Upsert adapter status (full replace)
            Service->>Aggregation: Trigger status aggregation
            Aggregation->>Aggregation: Compute aggregated conditions (Available/Ready)
            Aggregation-->>Service: Aggregation result
            Service-->>Adapter: 200/201 Success
        end
    end
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~50 minutes

Possibly related PRs

Suggested labels

lgtm, approved

Suggested reviewers

  • AlexVulaj
🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 67.39% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately describes the main change: enforcing mandatory conditions in adapter status, which aligns with the core objective of validating Available, Applied, and Health conditions.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (4)
pkg/services/cluster_test.go (2)

943-946: Inconsistent use of string literals for condition types.

Lines 944, 1024, 1071, 1130, and 1156 use the string literal "Available" instead of api.ConditionTypeAvailable. For consistency with the rest of the file, consider using the API constants throughout.

♻️ Suggested fix for line 944 (apply similar fixes to other occurrences)
 	validConditions := []api.AdapterCondition{
-		{Type: "Available", Status: api.AdapterConditionTrue, LastTransitionTime: time.Now()},
+		{Type: api.ConditionTypeAvailable, Status: api.AdapterConditionTrue, LastTransitionTime: time.Now()},
 		{Type: api.ConditionTypeApplied, Status: api.AdapterConditionTrue, LastTransitionTime: time.Now()},
 		{Type: api.ConditionTypeHealth, Status: api.AdapterConditionTrue, LastTransitionTime: time.Now()},
 	}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/services/cluster_test.go` around lines 943 - 946, Replace the hard-coded
string "Available" used for AdapterCondition Type with the API constant to
maintain consistency: update the entries in validConditions (and the other
occurrences at the noted locations) to use api.ConditionTypeAvailable instead of
the string literal; ensure the type field remains of the same type
(api.AdapterCondition.Type) and run tests to verify no type mismatches.

857-864: Inconsistent use of string literal instead of API constant.

Line 858 uses the string literal "Ready" while the rest of the file uses api.ConditionTypeReady. For consistency, consider using the API constant.

♻️ Suggested fix
 		{
-			Type:               "Ready",
+			Type:               api.ConditionTypeReady,
 			Status:             api.ConditionFalse,
 			ObservedGeneration: 1,
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/services/cluster_test.go` around lines 857 - 864, Replace the hard-coded
string "Ready" in the Condition struct literal with the API constant to maintain
consistency: change the Type field value from the string literal to use
api.ConditionTypeReady (the same constant used elsewhere in this file/tests),
ensuring imports/reference to api remain correct.
pkg/services/node_pool.go (1)

269-292: Validation logic is correct; consider structured logging improvement.

The mandatory condition validation correctly mirrors the cluster.go implementation. However, similar to the cluster service, consider using structured logging fields instead of fmt.Sprintf.

♻️ Suggested refactor for structured logging
 	for _, mandatoryType := range mandatoryConditionTypes {
 		status, exists := conditionMap[mandatoryType]
 		if !exists {
 			// Missing mandatory condition - discard the update
-			logger.With(ctx, logger.FieldNodePoolID, nodePoolID).
-				Info(fmt.Sprintf("Discarding adapter status update from %s: missing mandatory condition %s",
-					adapterStatus.Adapter, mandatoryType))
+			logger.With(ctx, logger.FieldNodePoolID, nodePoolID, "adapter", adapterStatus.Adapter, "missing_condition", mandatoryType).
+				Info("Discarding adapter status update: missing mandatory condition")
 			return nil, nil
 		}
 		if status == api.AdapterConditionUnknown {
 			// Mandatory condition has Unknown status - discard the update
-			logger.With(ctx, logger.FieldNodePoolID, nodePoolID).
-				Info(fmt.Sprintf("Discarding adapter status update from %s: mandatory condition %s has Unknown status",
-					adapterStatus.Adapter, mandatoryType))
+			logger.With(ctx, logger.FieldNodePoolID, nodePoolID, "adapter", adapterStatus.Adapter, "condition", mandatoryType).
+				Info("Discarding adapter status update: mandatory condition has Unknown status")
 			return nil, nil
 		}
 	}

Based on learnings: "In the openshift-hyperfleet/hyperfleet-api repository, NewOCMLogger(ctx) is deprecated. Switch to the slog-based logger and use logger.With(ctx, ...) to produce structured, context-aware logs."

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/services/node_pool.go` around lines 269 - 292, The logging uses
fmt.Sprintf for messages in the mandatory condition checks; replace these with
structured, context-aware logs using logger.With(ctx, logger.FieldNodePoolID,
nodePoolID, "adapter", adapterStatus.Adapter, "mandatoryCondition",
mandatoryType, "status", status) and then call Info with a short message (e.g.,
"Discarding adapter status update") so fields carry the details; update both the
missing-condition and Unknown-status branches and keep the validation logic
around conditionMap, mandatoryConditionTypes, and adapterStatus.Adapter
unchanged.
pkg/services/cluster.go (1)

272-295: Consider using structured logging fields instead of fmt.Sprintf.

The current approach uses fmt.Sprintf to format the log message. Per repository conventions, structured logging with fields is preferred for better log parsing and filtering.

♻️ Suggested refactor for structured logging
 	for _, mandatoryType := range mandatoryConditionTypes {
 		status, exists := conditionMap[mandatoryType]
 		if !exists {
 			// Missing mandatory condition - discard the update
 			ctx = logger.WithClusterID(ctx, clusterID)
-			logger.Info(ctx, fmt.Sprintf("Discarding adapter status update from %s: missing mandatory condition %s",
-				adapterStatus.Adapter, mandatoryType))
+			logger.With(ctx, "adapter", adapterStatus.Adapter, "missing_condition", mandatoryType).
+				Info("Discarding adapter status update: missing mandatory condition")
 			return nil, nil
 		}
 		if status == api.AdapterConditionUnknown {
 			// Mandatory condition has Unknown status - discard the update
 			ctx = logger.WithClusterID(ctx, clusterID)
-			logger.Info(ctx, fmt.Sprintf("Discarding adapter status update from %s: mandatory condition %s has Unknown status",
-				adapterStatus.Adapter, mandatoryType))
+			logger.With(ctx, "adapter", adapterStatus.Adapter, "condition", mandatoryType).
+				Info("Discarding adapter status update: mandatory condition has Unknown status")
 			return nil, nil
 		}
 	}

Based on learnings: "In the openshift-hyperfleet/hyperfleet-api repository, NewOCMLogger(ctx) is deprecated. Switch to the slog-based logger and use logger.With(ctx, ...) to produce structured, context-aware logs."

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/services/cluster.go` around lines 272 - 295, Replace the
fmt.Sprintf-based logs with structured slog-style logging using logger.With(...)
and pass clusterID, adapter and mandatory condition as fields; for the
missing-condition case call something like ctx = logger.With(ctx, "cluster_id",
clusterID, "adapter", adapterStatus.Adapter, "mandatory_condition",
mandatoryType) and then logger.Info(ctx, "Discarding adapter status update:
missing mandatory condition"); do the same for the Unknown-status branch (use
the same fields and a message like "Discarding adapter status update: mandatory
condition has Unknown status"), removing fmt.Sprintf and keeping the existing
early returns; references: conditionMap, mandatoryConditionTypes,
logger.WithClusterID / logger.With, logger.Info, adapterStatus.Adapter,
mandatoryType.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@pkg/services/cluster_test.go`:
- Around line 943-946: Replace the hard-coded string "Available" used for
AdapterCondition Type with the API constant to maintain consistency: update the
entries in validConditions (and the other occurrences at the noted locations) to
use api.ConditionTypeAvailable instead of the string literal; ensure the type
field remains of the same type (api.AdapterCondition.Type) and run tests to
verify no type mismatches.
- Around line 857-864: Replace the hard-coded string "Ready" in the Condition
struct literal with the API constant to maintain consistency: change the Type
field value from the string literal to use api.ConditionTypeReady (the same
constant used elsewhere in this file/tests), ensuring imports/reference to api
remain correct.

In `@pkg/services/cluster.go`:
- Around line 272-295: Replace the fmt.Sprintf-based logs with structured
slog-style logging using logger.With(...) and pass clusterID, adapter and
mandatory condition as fields; for the missing-condition case call something
like ctx = logger.With(ctx, "cluster_id", clusterID, "adapter",
adapterStatus.Adapter, "mandatory_condition", mandatoryType) and then
logger.Info(ctx, "Discarding adapter status update: missing mandatory
condition"); do the same for the Unknown-status branch (use the same fields and
a message like "Discarding adapter status update: mandatory condition has
Unknown status"), removing fmt.Sprintf and keeping the existing early returns;
references: conditionMap, mandatoryConditionTypes, logger.WithClusterID /
logger.With, logger.Info, adapterStatus.Adapter, mandatoryType.

In `@pkg/services/node_pool.go`:
- Around line 269-292: The logging uses fmt.Sprintf for messages in the
mandatory condition checks; replace these with structured, context-aware logs
using logger.With(ctx, logger.FieldNodePoolID, nodePoolID, "adapter",
adapterStatus.Adapter, "mandatoryCondition", mandatoryType, "status", status)
and then call Info with a short message (e.g., "Discarding adapter status
update") so fields carry the details; update both the missing-condition and
Unknown-status branches and keep the validation logic around conditionMap,
mandatoryConditionTypes, and adapterStatus.Adapter unchanged.

ℹ️ Review info

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between ab950d5 and e277266.

📒 Files selected for processing (7)
  • pkg/api/status_types.go
  • pkg/services/cluster.go
  • pkg/services/cluster_test.go
  • pkg/services/node_pool.go
  • pkg/services/node_pool_test.go
  • pkg/services/status_aggregation.go
  • test/integration/adapter_status_test.go

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
pkg/services/cluster.go (1)

33-35: ⚠️ Potential issue | 🟡 Minor

Interface comment is stale after behavior change.

Line 34 and Line 35 still describe the old “Unknown Available” flow only. The interface docs should mention mandatory-condition enforcement (Available/Applied/Health) and rejection semantics.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/services/cluster.go` around lines 33 - 35, Update the stale interface
comment for ProcessAdapterStatus to reflect current behavior: state that
ProcessAdapterStatus enforces mandatory conditions (Available, Applied, Health),
rejects/returns an error if any mandatory condition is missing or invalid, and
otherwise upserts the status and triggers aggregation; remove or replace the old
“Unknown Available” only description and explicitly document the rejection
semantics (what conditions cause an error vs no-op) and the return values (e.g.,
(nil, nil) only used for explicit no-op cases).
pkg/services/node_pool.go (1)

33-35: ⚠️ Potential issue | 🟡 Minor

ProcessAdapterStatus comments are outdated.

Line 34, Line 35, and Line 242-Line 244 still describe the old Unknown-handling path. They should be updated to reflect mandatory-condition validation and discard-on-missing/Unknown behavior.

Also applies to: 241-244

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/services/node_pool.go` around lines 33 - 35, Update the outdated comments
for ProcessAdapterStatus to describe the new mandatory-condition validation and
discard behavior: state that the function validates a required condition
(instead of treating "Unknown" as a no-op), and that if the mandatory condition
is missing or set to Unknown the status is discarded (not returned as nil,nil),
otherwise it upserts the status and triggers aggregation; also update the
related comments near the aggregation/upsert section (previously lines
describing the Unknown-handling path) to reflect this discard-on-missing/Unknown
rule so documentation matches the current logic in ProcessAdapterStatus and any
helper methods it calls.
🧹 Nitpick comments (1)
pkg/services/cluster.go (1)

273-280: Prefer structured log fields over fmt.Sprintf messages.

Lines 273 and 279 use dynamic strings with fmt.Sprintf, which reduces log queryability. Use logger.With(ctx, ...) to add structured fields (adapter, missing_condition, unknown_condition) with a static message instead.

Refactor pattern
-		logger.Info(ctx, fmt.Sprintf("Discarding adapter status update from %s: missing mandatory condition %s",
-			adapterStatus.Adapter, missingCondition))
+		logger.With(ctx,
+			"adapter", adapterStatus.Adapter,
+			"missing_condition", missingCondition,
+		).Info("Discarding adapter status update: missing mandatory condition")
 		return nil, nil
 	} else if unknownCondition != "" {
 		// Mandatory condition has Unknown status - discard the update
 		ctx = logger.WithClusterID(ctx, clusterID)
-		logger.Info(ctx, fmt.Sprintf("Discarding adapter status update from %s: mandatory condition %s has Unknown status",
-			adapterStatus.Adapter, unknownCondition))
+		logger.With(ctx,
+			"adapter", adapterStatus.Adapter,
+			"unknown_condition", unknownCondition,
+		).Info("Discarding adapter status update: mandatory condition has Unknown status")
 		return nil, nil
 	}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/services/cluster.go` around lines 273 - 280, Replace the
fmt.Sprintf-based Info logs in the adapter status discard branches with
structured fields: in the missing mandatory condition branch (where
adapterStatus.Adapter and missingCondition are used) and in the unknown
condition branch (where adapterStatus.Adapter and unknownCondition are used),
call logger.With(ctx, ...) to add fields like "adapter": adapterStatus.Adapter,
"missing_condition": missingCondition or "unknown_condition": unknownCondition,
keep the static message (e.g., "Discarding adapter status update: missing
mandatory condition" / "Discarding adapter status update: mandatory condition
has Unknown status"), preserve the cluster-scoped context by using
logger.WithClusterID(ctx, clusterID) as before, and then call logger.Info with
the static message and the enriched context.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@pkg/services/status_aggregation.go`:
- Around line 28-30: The docstring uses the name hasUnknown but the function
actually returns unknownCondition string; update the comment to match the actual
return variable name (unknownCondition) or rename the return variable to
hasUnknown for consistency; locate the function that declares the return
variable unknownCondition string in pkg/services/status_aggregation.go and
either change the doc comment to describe unknownCondition (the first mandatory
condition with Unknown status) or rename the return identifier to hasUnknown and
update all references accordingly.
- Around line 40-54: The current build of conditionMap overwrites earlier
statuses so a later True can hide an earlier Unknown; when populating
conditionMap in the loop over conditions, preserve any Unknown status for a
given cond.Type (i.e., treat Unknown as the highest-severity value) instead of
blindly assigning cond.Status; implement this by checking existing :=
conditionMap[cond.Type] and if existing == api.AdapterConditionUnknown continue
(or if cond.Status == api.AdapterConditionUnknown set conditionMap[cond.Type] =
api.AdapterConditionUnknown), so validation using mandatoryConditionTypes and
AdapterConditionUnknown correctly fails when any duplicate reported Unknown.

---

Outside diff comments:
In `@pkg/services/cluster.go`:
- Around line 33-35: Update the stale interface comment for ProcessAdapterStatus
to reflect current behavior: state that ProcessAdapterStatus enforces mandatory
conditions (Available, Applied, Health), rejects/returns an error if any
mandatory condition is missing or invalid, and otherwise upserts the status and
triggers aggregation; remove or replace the old “Unknown Available” only
description and explicitly document the rejection semantics (what conditions
cause an error vs no-op) and the return values (e.g., (nil, nil) only used for
explicit no-op cases).

In `@pkg/services/node_pool.go`:
- Around line 33-35: Update the outdated comments for ProcessAdapterStatus to
describe the new mandatory-condition validation and discard behavior: state that
the function validates a required condition (instead of treating "Unknown" as a
no-op), and that if the mandatory condition is missing or set to Unknown the
status is discarded (not returned as nil,nil), otherwise it upserts the status
and triggers aggregation; also update the related comments near the
aggregation/upsert section (previously lines describing the Unknown-handling
path) to reflect this discard-on-missing/Unknown rule so documentation matches
the current logic in ProcessAdapterStatus and any helper methods it calls.

---

Nitpick comments:
In `@pkg/services/cluster.go`:
- Around line 273-280: Replace the fmt.Sprintf-based Info logs in the adapter
status discard branches with structured fields: in the missing mandatory
condition branch (where adapterStatus.Adapter and missingCondition are used) and
in the unknown condition branch (where adapterStatus.Adapter and
unknownCondition are used), call logger.With(ctx, ...) to add fields like
"adapter": adapterStatus.Adapter, "missing_condition": missingCondition or
"unknown_condition": unknownCondition, keep the static message (e.g.,
"Discarding adapter status update: missing mandatory condition" / "Discarding
adapter status update: mandatory condition has Unknown status"), preserve the
cluster-scoped context by using logger.WithClusterID(ctx, clusterID) as before,
and then call logger.Info with the static message and the enriched context.

ℹ️ Review info

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between e277266 and 57ff129.

📒 Files selected for processing (8)
  • pkg/api/status_types.go
  • pkg/services/cluster.go
  • pkg/services/cluster_test.go
  • pkg/services/node_pool.go
  • pkg/services/node_pool_test.go
  • pkg/services/status_aggregation.go
  • pkg/services/status_aggregation_test.go
  • test/integration/adapter_status_test.go
🚧 Files skipped from review as they are similar to previous changes (2)
  • pkg/api/status_types.go
  • pkg/services/node_pool_test.go

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (1)
pkg/services/cluster_test.go (1)

499-499: Replace remaining hardcoded condition type literals with API constants.

Line 499/562/846/932/984/1013/1072/1098 still use raw "Ready"/"Available" strings. Prefer api.ConditionTypeReady and api.ConditionTypeAvailable for consistency and to avoid future drift.

Also applies to: 562-562, 846-846, 932-932, 984-984, 1013-1013, 1072-1072, 1098-1098

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/services/cluster_test.go` at line 499, Replace the hardcoded condition
type string literals "Ready" and "Available" with the API constants
api.ConditionTypeReady and api.ConditionTypeAvailable in the test file; locate
the Condition entries (e.g., the struct field "Type:" used in the test cases in
cluster_test.go) and update those Type: "Ready"/"Available" occurrences to Type:
api.ConditionTypeReady/Type: api.ConditionTypeAvailable respectively so all
seven instances use the API constants.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@pkg/services/cluster_test.go`:
- Line 499: Replace the hardcoded condition type string literals "Ready" and
"Available" with the API constants api.ConditionTypeReady and
api.ConditionTypeAvailable in the test file; locate the Condition entries (e.g.,
the struct field "Type:" used in the test cases in cluster_test.go) and update
those Type: "Ready"/"Available" occurrences to Type:
api.ConditionTypeReady/Type: api.ConditionTypeAvailable respectively so all
seven instances use the API constants.

ℹ️ Review info

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 57ff129 and b152dd8.

📒 Files selected for processing (8)
  • pkg/api/status_types.go
  • pkg/services/cluster.go
  • pkg/services/cluster_test.go
  • pkg/services/node_pool.go
  • pkg/services/node_pool_test.go
  • pkg/services/status_aggregation.go
  • pkg/services/status_aggregation_test.go
  • test/integration/adapter_status_test.go

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (3)
pkg/services/node_pool.go (1)

269-281: Prefer fielded logging for rejection reasons.

Using structured fields here will make filtering by adapter/condition much easier in production logs.

♻️ Suggested refactor
-		logger.With(ctx, logger.FieldNodePoolID, nodePoolID).
-			Info(fmt.Sprintf("Discarding adapter status update from %s: missing mandatory condition %s",
-				adapterStatus.Adapter, missingCondition))
+		logger.With(ctx,
+			logger.FieldNodePoolID, nodePoolID,
+			"adapter", adapterStatus.Adapter,
+			"missing_condition", missingCondition,
+		).Info("Discarding adapter status update: missing mandatory condition")
 		return nil, nil
 	} else if unknownCondition != "" {
 		// Mandatory condition has Unknown status - discard the update
-		logger.With(ctx, logger.FieldNodePoolID, nodePoolID).
-			Info(fmt.Sprintf("Discarding adapter status update from %s: mandatory condition %s has Unknown status",
-				adapterStatus.Adapter, unknownCondition))
+		logger.With(ctx,
+			logger.FieldNodePoolID, nodePoolID,
+			"adapter", adapterStatus.Adapter,
+			"unknown_condition", unknownCondition,
+		).Info("Discarding adapter status update: mandatory condition is Unknown")
 		return nil, nil
 	}

Based on learnings: In the openshift-hyperfleet/hyperfleet-api repository, NewOCMLogger(ctx) is deprecated. Switch to the slog-based logger and use logger.With(ctx, ...) to produce structured, context-aware logs.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/services/node_pool.go` around lines 269 - 281, Replace the
string-constructed logs in the ValidateMandatoryConditions handling with
structured/fielded logging: in the branch where ValidateMandatoryConditions
returns missingCondition, call logger.With(ctx, logger.FieldNodePoolID,
nodePoolID, "adapter", adapterStatus.Adapter, "missing_condition",
missingCondition).Info("Discarding adapter status update: missing mandatory
condition") and in the unknownCondition branch call logger.With(ctx,
logger.FieldNodePoolID, nodePoolID, "adapter", adapterStatus.Adapter,
"unknown_condition", unknownCondition).Info("Discarding adapter status update:
mandatory condition has Unknown status"); keep the same return nil, nil behavior
and the ValidateMandatoryConditions, adapterStatus.Adapter, missingCondition and
unknownCondition symbols to locate the code.
pkg/services/cluster_test.go (1)

932-933: Use api.ConditionTypeAvailable/api.ConditionTypeReady consistently in tests.

A few new/changed assertions still use raw string literals; switching them to constants reduces drift risk if condition names evolve.

Also applies to: 984-985, 1013-1013, 1072-1072, 1098-1098

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/services/cluster_test.go` around lines 932 - 933, Replace hardcoded
condition name string literals in test condition structs with the appropriate
api constants to avoid drift: change occurrences like {Type: "Available", ...}
or other raw "Ready"/"Available" strings to use api.ConditionTypeAvailable (or
api.ConditionTypeReady where intended), keeping existing uses of
api.ConditionTypeApplied as-is; update all similar occurrences (the ones near
the shown diff and the other assertions mentioned) so every condition Type uses
the api.ConditionType... constants.
pkg/services/cluster.go (1)

269-281: Use structured fields for discard logs instead of fmt.Sprintf.

This keeps adapter/condition dimensions queryable and consistent with the current logging approach.

♻️ Suggested refactor
-		ctx = logger.WithClusterID(ctx, clusterID)
-		logger.Info(ctx, fmt.Sprintf("Discarding adapter status update from %s: missing mandatory condition %s",
-			adapterStatus.Adapter, missingCondition))
+		ctx = logger.WithClusterID(ctx, clusterID)
+		logger.With(ctx,
+			"adapter", adapterStatus.Adapter,
+			"missing_condition", missingCondition,
+		).Info("Discarding adapter status update: missing mandatory condition")
 		return nil, nil
 	} else if unknownCondition != "" {
 		// Mandatory condition has Unknown status - discard the update
-		ctx = logger.WithClusterID(ctx, clusterID)
-		logger.Info(ctx, fmt.Sprintf("Discarding adapter status update from %s: mandatory condition %s has Unknown status",
-			adapterStatus.Adapter, unknownCondition))
+		ctx = logger.WithClusterID(ctx, clusterID)
+		logger.With(ctx,
+			"adapter", adapterStatus.Adapter,
+			"unknown_condition", unknownCondition,
+		).Info("Discarding adapter status update: mandatory condition is Unknown")
 		return nil, nil
 	}

Based on learnings: In the openshift-hyperfleet/hyperfleet-api repository, NewOCMLogger(ctx) is deprecated. Switch to the slog-based logger and use logger.With(ctx, ...) to produce structured, context-aware logs.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/services/cluster.go` around lines 269 - 281, Replace the
fmt.Sprintf-based Info logs with structured slog-style logs: after setting ctx =
logger.WithClusterID(ctx, clusterID) use logger.With(ctx, "adapter",
adapterStatus.Adapter, "condition", missingCondition) and call logger.Info(ctx,
"Discarding adapter status update: missing mandatory condition") for the
ValidateMandatoryConditions missingCondition branch; likewise use
logger.With(ctx, "adapter", adapterStatus.Adapter, "condition",
unknownCondition) and logger.Info(ctx, "Discarding adapter status update:
mandatory condition has Unknown status") for the unknownCondition branch
(reference functions/vars: ValidateMandatoryConditions, adapterStatus.Adapter,
missingCondition, unknownCondition, logger.WithClusterID, logger.With,
logger.Info).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@pkg/services/cluster_test.go`:
- Around line 932-933: Replace hardcoded condition name string literals in test
condition structs with the appropriate api constants to avoid drift: change
occurrences like {Type: "Available", ...} or other raw "Ready"/"Available"
strings to use api.ConditionTypeAvailable (or api.ConditionTypeReady where
intended), keeping existing uses of api.ConditionTypeApplied as-is; update all
similar occurrences (the ones near the shown diff and the other assertions
mentioned) so every condition Type uses the api.ConditionType... constants.

In `@pkg/services/cluster.go`:
- Around line 269-281: Replace the fmt.Sprintf-based Info logs with structured
slog-style logs: after setting ctx = logger.WithClusterID(ctx, clusterID) use
logger.With(ctx, "adapter", adapterStatus.Adapter, "condition",
missingCondition) and call logger.Info(ctx, "Discarding adapter status update:
missing mandatory condition") for the ValidateMandatoryConditions
missingCondition branch; likewise use logger.With(ctx, "adapter",
adapterStatus.Adapter, "condition", unknownCondition) and logger.Info(ctx,
"Discarding adapter status update: mandatory condition has Unknown status") for
the unknownCondition branch (reference functions/vars:
ValidateMandatoryConditions, adapterStatus.Adapter, missingCondition,
unknownCondition, logger.WithClusterID, logger.With, logger.Info).

In `@pkg/services/node_pool.go`:
- Around line 269-281: Replace the string-constructed logs in the
ValidateMandatoryConditions handling with structured/fielded logging: in the
branch where ValidateMandatoryConditions returns missingCondition, call
logger.With(ctx, logger.FieldNodePoolID, nodePoolID, "adapter",
adapterStatus.Adapter, "missing_condition", missingCondition).Info("Discarding
adapter status update: missing mandatory condition") and in the unknownCondition
branch call logger.With(ctx, logger.FieldNodePoolID, nodePoolID, "adapter",
adapterStatus.Adapter, "unknown_condition", unknownCondition).Info("Discarding
adapter status update: mandatory condition has Unknown status"); keep the same
return nil, nil behavior and the ValidateMandatoryConditions,
adapterStatus.Adapter, missingCondition and unknownCondition symbols to locate
the code.

ℹ️ Review info

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between b152dd8 and d5df0d0.

📒 Files selected for processing (8)
  • pkg/api/status_types.go
  • pkg/services/cluster.go
  • pkg/services/cluster_test.go
  • pkg/services/node_pool.go
  • pkg/services/node_pool_test.go
  • pkg/services/status_aggregation.go
  • pkg/services/status_aggregation_test.go
  • test/integration/adapter_status_test.go
🚧 Files skipped from review as they are similar to previous changes (1)
  • pkg/api/status_types.go

logger.Info(ctx, fmt.Sprintf("Discarding adapter status update from %s: missing mandatory condition %s",
adapterStatus.Adapter, missingCondition))
return nil, nil
} else if unknownCondition != "" {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm questioning ...

  • Should only enforce the Unknown check for the Available condition?

Available is the only meaningful condition for our aggregate computation (until we allow to be configurable)

  • If is that important
  • Should other conditions affect if we accept the status report?

This also leds to the question... if we allow in the future the customization of the condition aggregation... should we also offer customizability of status report accept/reject? (seems adding too much complexity though)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This update is based on the Acceptance criteria of HYPERFLEET-657:

    status responses without mandatory conditions should be discarded
    status responses with mandatory conditions should be taken into account
        If values for conditions status are other than Unknown
        Any status report with any of the mandatory conditions with status Unknown will be discarded (and therefore retried a reconciliation later)

Your point let me rethinking that as Available is the only meaningful condition for our aggregate computation, currently, it is better to only do the smallest changes, to only check for Unknown in the Available condition, so we can still preserve the states of other conditions (Applied, Health) for diagnostics and tracing.

If in the future, we support the customization of the condition aggregation, the accept/reject logic for Unknown status should be based on the same configuration, for example, the cluster-level Available is based on the adapter-level Health=true, Available=true, and so on.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
pkg/services/cluster_test.go (1)

227-233: ⚠️ Potential issue | 🟡 Minor

“Subsequent Unknown” tests can pass for the wrong reason.

These payloads omit Applied/Health, so rejection can happen due to missing mandatory conditions rather than the Unknown-Available path the test names describe.

✅ Tighten branch coverage
 existingConditions := []api.AdapterCondition{
-	{Type: api.ConditionTypeAvailable, Status: api.AdapterConditionUnknown, LastTransitionTime: time.Now()},
+	{Type: api.ConditionTypeAvailable, Status: api.AdapterConditionUnknown, LastTransitionTime: time.Now()},
+	{Type: api.ConditionTypeApplied, Status: api.AdapterConditionTrue, LastTransitionTime: time.Now()},
+	{Type: api.ConditionTypeHealth, Status: api.AdapterConditionTrue, LastTransitionTime: time.Now()},
 }
 conditions := []api.AdapterCondition{
 	{Type: api.ConditionTypeReady, Status: api.AdapterConditionTrue, LastTransitionTime: time.Now()},
 	{Type: api.ConditionTypeAvailable, Status: api.AdapterConditionUnknown, LastTransitionTime: time.Now()},
+	{Type: api.ConditionTypeApplied, Status: api.AdapterConditionTrue, LastTransitionTime: time.Now()},
+	{Type: api.ConditionTypeHealth, Status: api.AdapterConditionTrue, LastTransitionTime: time.Now()},
 	{Type: "Progressing", Status: api.AdapterConditionTrue, LastTransitionTime: time.Now()},
 }

Also applies to: 561-577

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/services/cluster_test.go` around lines 227 - 233, The test payloads
constructing []api.AdapterCondition with Type: api.ConditionTypeAvailable and
Status: api.AdapterConditionUnknown are missing the mandatory Applied/Health
conditions, so tests for the "Unknown→Available" path can pass for the wrong
reason; update the test cases (in pkg/services/cluster_test.go where
AdapterCondition is constructed) to include explicit Applied and Health
conditions (e.g., add api.ConditionTypeApplied and api.ConditionTypeHealthy
entries with appropriate Status values) so rejection can only occur via the
intended Unknown→Available logic rather than due to missing mandatory
conditions.
♻️ Duplicate comments (2)
pkg/services/cluster.go (1)

271-278: ⚠️ Potential issue | 🟠 Major

unknownCondition from validation is ignored, so mandatory Unknown can slip through.

ValidateMandatoryConditions returns both missing and unknown mandatory condition types, but this branch only checks missingCondition. As written, updates with Applied=Unknown or Health=Unknown are accepted and upserted.

🔧 Proposed fix
-	// Validate that all mandatory conditions are present
-	if missingCondition, _ := ValidateMandatoryConditions(conditions); missingCondition != "" {
+	// Validate mandatory conditions
+	missingCondition, unknownCondition := ValidateMandatoryConditions(conditions)
+	if missingCondition != "" {
 		// Missing mandatory condition - discard the update
 		ctx = logger.WithClusterID(ctx, clusterID)
 		logger.Info(ctx, fmt.Sprintf("Discarding adapter status update from %s: missing mandatory condition %s",
 			adapterStatus.Adapter, missingCondition))
 		return nil, nil
 	}
+	if unknownCondition != "" && unknownCondition != api.ConditionTypeAvailable {
+		ctx = logger.WithClusterID(ctx, clusterID)
+		logger.Info(ctx, fmt.Sprintf("Discarding adapter status update from %s: mandatory condition %s is Unknown",
+			adapterStatus.Adapter, unknownCondition))
+		return nil, nil
+	}

Also applies to: 280-297

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/services/cluster.go` around lines 271 - 278, The validation currently
ignores the second return (unknownCondition) from ValidateMandatoryConditions so
updates with mandatory conditions set to Unknown slip through; update the branch
that checks ValidateMandatoryConditions(conditions) to capture both return
values (e.g., missingCondition, unknownCondition) and treat either non-empty
value as grounds to discard the update — log a clear message using
logger.WithClusterID(ctx, clusterID) and logger.Info including
adapterStatus.Adapter and the offending condition (preferably naming whether it
is missing or unknown), and apply the same fix to the other similar validation
blocks that call ValidateMandatoryConditions (the later branch handling adapter
status upsert).
pkg/services/node_pool.go (1)

272-279: ⚠️ Potential issue | 🟠 Major

Mandatory Unknown handling is incomplete here as well (unknownCondition is dropped).

This path discards only missing mandatory conditions, so Applied=Unknown / Health=Unknown updates can still be persisted.

🔧 Proposed fix
-	// Validate that all mandatory conditions are present
-	if missingCondition, _ := ValidateMandatoryConditions(conditions); missingCondition != "" {
+	// Validate mandatory conditions
+	missingCondition, unknownCondition := ValidateMandatoryConditions(conditions)
+	if missingCondition != "" {
 		// Missing mandatory condition - discard the update
 		logger.With(ctx, logger.FieldNodePoolID, nodePoolID).
 			Info(fmt.Sprintf("Discarding adapter status update from %s: missing mandatory condition %s",
 				adapterStatus.Adapter, missingCondition))
 		return nil, nil
 	}
+	if unknownCondition != "" && unknownCondition != api.ConditionTypeAvailable {
+		logger.With(ctx, logger.FieldNodePoolID, nodePoolID).
+			Info(fmt.Sprintf("Discarding adapter status update from %s: mandatory condition %s is Unknown",
+				adapterStatus.Adapter, unknownCondition))
+		return nil, nil
+	}

Also applies to: 281-298

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/services/node_pool.go` around lines 272 - 279, The current check only
handles missing mandatory conditions from ValidateMandatoryConditions and
ignores any returned unknownCondition; update the logic around the call to
ValidateMandatoryConditions in node_pool.go so that you capture both return
values (e.g., missingCondition, unknownCondition :=
ValidateMandatoryConditions(conditions)) and if either missingCondition != "" or
unknownCondition != "" log the discard using the same logger.With(ctx,
logger.FieldNodePoolID, nodePoolID) context (include adapterStatus.Adapter and
the specific condition name and whether it was "missing" or "unknown" in the
message) and return nil, nil; apply the same change to the other similar block
handling adapter status updates so Unknown conditions are treated like Missing
and the update is not persisted.
🧹 Nitpick comments (1)
pkg/services/cluster_test.go (1)

500-500: Use the new condition constants consistently in these new test cases.

A few changed tests still hardcode "Available"/"Ready" instead of api.ConditionTypeAvailable / api.ConditionTypeReady, which weakens drift protection.

Also applies to: 563-563, 933-933, 985-985, 1014-1014, 1073-1073, 1099-1099

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/services/cluster_test.go` at line 500, Replace hardcoded condition type
strings in the new tests with the package constants: wherever a Condition struct
uses Type: "Ready" or Type: "Available" (e.g., the occurrence shown in
pkg/services/cluster_test.go and the other locations noted), change those to use
api.ConditionTypeReady and api.ConditionTypeAvailable respectively; update all
instances listed (lines referenced in the review) so tests reference
api.ConditionTypeReady / api.ConditionTypeAvailable instead of the raw string
literals to ensure consistent drift protection.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Outside diff comments:
In `@pkg/services/cluster_test.go`:
- Around line 227-233: The test payloads constructing []api.AdapterCondition
with Type: api.ConditionTypeAvailable and Status: api.AdapterConditionUnknown
are missing the mandatory Applied/Health conditions, so tests for the
"Unknown→Available" path can pass for the wrong reason; update the test cases
(in pkg/services/cluster_test.go where AdapterCondition is constructed) to
include explicit Applied and Health conditions (e.g., add
api.ConditionTypeApplied and api.ConditionTypeHealthy entries with appropriate
Status values) so rejection can only occur via the intended Unknown→Available
logic rather than due to missing mandatory conditions.

---

Duplicate comments:
In `@pkg/services/cluster.go`:
- Around line 271-278: The validation currently ignores the second return
(unknownCondition) from ValidateMandatoryConditions so updates with mandatory
conditions set to Unknown slip through; update the branch that checks
ValidateMandatoryConditions(conditions) to capture both return values (e.g.,
missingCondition, unknownCondition) and treat either non-empty value as grounds
to discard the update — log a clear message using logger.WithClusterID(ctx,
clusterID) and logger.Info including adapterStatus.Adapter and the offending
condition (preferably naming whether it is missing or unknown), and apply the
same fix to the other similar validation blocks that call
ValidateMandatoryConditions (the later branch handling adapter status upsert).

In `@pkg/services/node_pool.go`:
- Around line 272-279: The current check only handles missing mandatory
conditions from ValidateMandatoryConditions and ignores any returned
unknownCondition; update the logic around the call to
ValidateMandatoryConditions in node_pool.go so that you capture both return
values (e.g., missingCondition, unknownCondition :=
ValidateMandatoryConditions(conditions)) and if either missingCondition != "" or
unknownCondition != "" log the discard using the same logger.With(ctx,
logger.FieldNodePoolID, nodePoolID) context (include adapterStatus.Adapter and
the specific condition name and whether it was "missing" or "unknown" in the
message) and return nil, nil; apply the same change to the other similar block
handling adapter status updates so Unknown conditions are treated like Missing
and the update is not persisted.

---

Nitpick comments:
In `@pkg/services/cluster_test.go`:
- Line 500: Replace hardcoded condition type strings in the new tests with the
package constants: wherever a Condition struct uses Type: "Ready" or Type:
"Available" (e.g., the occurrence shown in pkg/services/cluster_test.go and the
other locations noted), change those to use api.ConditionTypeReady and
api.ConditionTypeAvailable respectively; update all instances listed (lines
referenced in the review) so tests reference api.ConditionTypeReady /
api.ConditionTypeAvailable instead of the raw string literals to ensure
consistent drift protection.

ℹ️ Review info

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between d5df0d0 and 90535fd.

📒 Files selected for processing (8)
  • pkg/api/status_types.go
  • pkg/services/cluster.go
  • pkg/services/cluster_test.go
  • pkg/services/node_pool.go
  • pkg/services/node_pool_test.go
  • pkg/services/status_aggregation.go
  • pkg/services/status_aggregation_test.go
  • test/integration/adapter_status_test.go
🚧 Files skipped from review as they are similar to previous changes (1)
  • pkg/services/status_aggregation_test.go

@yasun1 yasun1 force-pushed the HYPERFLEET-657 branch 2 times, most recently from 42b0d55 to 359fe5f Compare March 3, 2026 11:55
@openshift-ci
Copy link

openshift-ci bot commented Mar 4, 2026

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign mbrudnoy for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@yasun1 yasun1 force-pushed the HYPERFLEET-657 branch 3 times, most recently from fbd924f to 983c68f Compare March 4, 2026 09:42
@@ -139,7 +149,7 @@ func TestNodePoolProcessAdapterStatus_SubsequentUnknownCondition(t *testing.T) {
// Pre-populate an existing adapter status
conditions := []api.AdapterCondition{
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Both TestNodePoolProcessAdapterStatus_SubsequentUnknownCondition and
TestNodePoolProcessAdapterStatus_SubsequentMultipleConditions_AvailableUnknown
send subsequent reports without Applied and Health conditions. This means
they get rejected by ValidateMandatoryConditions (missing mandatory condition)
and never reach the "subsequent Unknown Available" check at all. They pass by
coincidence since both paths return nil, but they don't actually test the
Unknown Available rejection path.

The cluster equivalents of these tests were correctly updated — just apply the
same fix here.

For SubsequentUnknownCondition (line 150), add Applied and Health to the
pre-populated conditions (the subsequent report reuses the same JSON):

Suggested change
conditions := []api.AdapterCondition{
conditions := []api.AdapterCondition{
{Type: api.ConditionTypeAvailable, Status: api.AdapterConditionUnknown, LastTransitionTime: time.Now()},
{Type: api.ConditionTypeApplied, Status: api.AdapterConditionTrue, LastTransitionTime: time.Now()},
{Type: api.ConditionTypeHealth, Status: api.AdapterConditionTrue, LastTransitionTime: time.Now()},
}

@@ -307,12 +337,12 @@ func TestNodePoolProcessAdapterStatus_SubsequentMultipleConditions_AvailableUnkn
// Now send another report with multiple conditions including Available=Unknown
conditions := []api.AdapterCondition{
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For the same reason mentioned before:

Suggested change
conditions := []api.AdapterCondition{
conditions := []api.AdapterCondition{
{Type: api.ConditionTypeAvailable, Status: api.AdapterConditionUnknown, LastTransitionTime: time.Now()},
{Type: api.ConditionTypeApplied, Status: api.AdapterConditionTrue, LastTransitionTime: time.Now()},
{Type: api.ConditionTypeHealth, Status: api.AdapterConditionTrue, LastTransitionTime: time.Now()},
{Type: api.ConditionTypeReady, Status: api.AdapterConditionTrue, LastTransitionTime: time.Now()},
}

@@ -689,7 +713,7 @@ func TestClusterAvailableReadyTransitions(t *testing.T) {
prevStatus := api.Cluster{}.StatusConditions
prevStatus = append(prevStatus, clusterDao.clusters[clusterID].StatusConditions...)
unknownConds := []api.AdapterCondition{
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The "Unknown is a no-op" sections in both TestClusterAvailableReadyTransitions
(line ~713) and TestNodePoolAvailableReadyTransitions (line ~497) send only
Available=Unknown without Applied/Health. With the new mandatory validation,
these get rejected for missing conditions before the Unknown Available check
is reached. The assertions still hold (conditions are preserved), but it's
worth adding Applied/Health to keep these sections testing the intended path:

Suggested change
unknownConds := []api.AdapterCondition{
unknownConds := []api.AdapterCondition{
{Type: api.ConditionTypeAvailable, Status: api.AdapterConditionUnknown, LastTransitionTime: time.Now()},
{Type: api.ConditionTypeApplied, Status: api.AdapterConditionTrue, LastTransitionTime: time.Now()},
{Type: api.ConditionTypeHealth, Status: api.AdapterConditionTrue, LastTransitionTime: time.Now()},
}

// Available=Unknown is a no-op (does not store, does not overwrite nodepool conditions).
prevStatus = api.NodePool{}.StatusConditions
prevStatus = append(prevStatus, nodePoolDao.nodePools[nodePoolID].StatusConditions...)
unknownConds := []api.AdapterCondition{
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For the same reason mentioned.

Suggested change
unknownConds := []api.AdapterCondition{
unknownConds := []api.AdapterCondition{
{Type: api.ConditionTypeAvailable, Status: api.AdapterConditionUnknown, LastTransitionTime: time.Now()},
{Type: api.ConditionTypeApplied, Status: api.AdapterConditionTrue, LastTransitionTime: time.Now()},
{Type: api.ConditionTypeHealth, Status: api.AdapterConditionTrue, LastTransitionTime: time.Now()},
}

Reject adapter status updates missing mandatory conditions (Available,
Applied, Health) or containing Unknown status. Returns HTTP 204 and
preserves existing status to prevent loss of critical cluster state.

Changes:
- Add condition type constants in pkg/api/status_types.go
- Validate mandatory conditions in cluster/nodepool services
- Replace hardcoded condition strings with constants
- Add 4 new tests, update 8 existing tests for validation
- Update integration tests to use constants

All tests passing: 416 unit, 51 integration, 0 lint issues

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

update

update

update

update

fix lint

update

fix unkonwn bypass mandatory issue

reject duplicate condition

update

update

update

update

// Check that all mandatory conditions are present
for _, mandatoryType := range mandatoryConditionTypes {
if !seen[mandatoryType] {
Copy link
Contributor

@rafabene rafabene Mar 4, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The JIRA acceptance criteria for HYPERFLEET-657 say:

"Any status report with any of the mandatory conditions with status Unknown will be discarded"

But ValidateMandatoryConditions doesn't check for Unknown status — it only
validates presence and uniqueness. The Unknown check is only done for
Available (in cluster.go and node_pool.go), so Applied=Unknown or
Health=Unknown pass through silently.

Should we add an Unknown check for all mandatory conditions here? Something
like:

Suggested change
if !seen[mandatoryType] {
if seen[mandatoryType] {
for _, cond := range conditions {
if cond.Type == mandatoryType && cond.Status == api.AdapterConditionUnknown {
return ConditionValidationErrorUnknown, mandatoryType
}
}
}

Then the Available-specific first-report exception in ProcessAdapterStatus
could be moved to run before validation (to bypass the Unknown check only for
the first Available=Unknown report).

If the current behavior is intentional (only Available=Unknown matters), it
would be good to document why Applied/Health Unknown are acceptable so the
JIRA can be updated to reflect the actual contract

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The current logic is:

  1. The status report must not have 'Duplicate' or 'Empty' types.
  2. The status report must have all of the mandatory types.
  3. Store the first status report never mind what the Available condition's status is.
  4. Reject the subsequent status report if its Available condition's status is 'Unknown'

This is by design. As we thought at here, currently, Available is the only meaningful condition for our aggregate computation, I am considering of only do the smallest changes until we allow the customized status aggressive. So the Acceptance Criteria in HYPERFLEET-657 is incorrect.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants