Skip to content

Add generic resource type with CRD-driven API#50

Open
ciaranRoche wants to merge 9 commits intoopenshift-hyperfleet:mainfrom
ciaranRoche:add-generic-resource-type-poc
Open

Add generic resource type with CRD-driven API#50
ciaranRoche wants to merge 9 commits intoopenshift-hyperfleet:mainfrom
ciaranRoche:add-generic-resource-type-poc

Conversation

@ciaranRoche
Copy link
Contributor

@ciaranRoche ciaranRoche commented Feb 9, 2026

Summary

  • Add generic resource type POC using Kubernetes CRDs to define API resources
  • Enable local development without requiring a Kubernetes cluster
  • Dynamically generate OpenAPI spec and routes from CRD definitions

Changes

CRD-Driven API

  • Load resource definitions from CRD YAML files in charts/crds/
  • Dynamically generate routes and OpenAPI spec at runtime
  • Support both local file loading (CRD_PATH env var) and Kubernetes API

Local Development

  • Add LoadFromDirectory() to pkg/crd/registry.go for loading CRDs from local files
  • Update make run and make run-no-auth to set CRD_PATH automatically
  • No Kubernetes cluster required for local development

Bug Fixes

  • Remove erroneous global list endpoints for owned resources (nodepools, idps)
  • Fix image-dev target to use linux/amd64 platform

Documentation

  • Update README.md, AGENTS.md, and docs/development.md for new workflow
  • Remove references to obsolete make generate commands

Test plan

  • make test passes (319 tests)
  • make run-no-auth starts server with CRDs loaded from local files
  • OpenAPI spec shows correct paths (no global list for owned resources)
  • Swagger UI accessible at /api/hyperfleet/v1/openapi.html

🤖 Generated with Claude Code

Summary by CodeRabbit

  • New Features

    • CRD-driven API: resources and routes discovered at startup with OpenAPI generated at runtime.
    • Generic Resources endpoints plus Cluster, NodePool, and IDP CRD support (root and nested paths).
    • CLI test script for quick end-to-end API checks.
  • Architecture Changes

    • Consolidated resources storage and DB migrations to a single resources table.
    • Local dev can load CRDs from charts/crds (CRD_PATH) or from the cluster; Helm adds RBAC knobs.
  • Chores

    • Build/runtime updates: FIPS flags, architecture-aware Docker runtime.
  • Documentation

    • Updated developer and troubleshooting workflow for CRD-driven runtime and mocks-focused development.

@coderabbitai
Copy link

coderabbitai bot commented Feb 9, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review

Walkthrough

Replaces the prior static OpenAPI/TypeSpec generation with a CRD-driven, runtime-generated API. Adds CRD manifests (charts/crds), a CRD registry (pkg/crd), dynamic OpenAPI generation (pkg/openapi), generic Resource models/DAO/service/handlers, migrations for a unified resources table, a resources plugin that loads CRDs and registers routes at startup, and test factories/scripts for the new resources flow. Removes legacy Cluster/NodePool models, DAOs, services, handlers, generated OpenAPI artifacts, many related tests, and updates build/tooling (Go 1.25, FIPS flags, mocks-only generation). Server now loads CRDs from CRD_PATH (local) or the Kubernetes API and generates routes and OpenAPI at startup.

Sequence Diagram(s)

sequenceDiagram
    participant App as Application
    participant FS as Local FS / Kubernetes API
    participant CRDReg as CRD Registry
    participant OpenAPI as OpenAPI Generator
    participant Router as HTTP Router
    participant ResourceSvc as ResourceService
    participant DB as Database

    App->>CRDReg: Load CRDs (CRD_PATH or Kubernetes API)
    CRDReg->>FS: Read CRD manifests / K8s CRDs
    FS-->>CRDReg: Return CRD definitions
    CRDReg-->>App: Registry populated

    App->>OpenAPI: GenerateSpec(registry)
    OpenAPI->>CRDReg: Request resource definitions
    CRDReg-->>OpenAPI: Return definitions
    OpenAPI-->>App: Return generated OpenAPI spec

    App->>Router: Register routes per CRD
    Router->>ResourceSvc: Wire handlers for kind
    ResourceSvc-->>DB: Persist/Query resources
    DB-->>ResourceSvc: Return results

    Router->>App: Incoming HTTP request (e.g., POST /clusters)
    Router->>ResourceSvc: Create/Handle resource
    ResourceSvc-->>DB: Persist resource
    DB-->>ResourceSvc: OK
    ResourceSvc-->>Router: Return resource
    Router-->>App: HTTP response (201/200)
Loading

Estimated code review effort

🎯 5 (Critical) | ⏱️ ~120 minutes

Possibly related PRs

Suggested labels

lgtm, approved

Suggested reviewers

  • aredenba-rh
✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 20

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
Makefile (1)

72-72: ⚠️ Potential issue | 🟡 Minor

Go version mismatch between Makefile and go.mod.

The Makefile checks for GO_VERSION:=go1.24. but go.mod specifies go 1.25.0. This will cause the version warning to always trigger during builds.

🔧 Suggested fix
 # Encourage consistent tool versions
-GO_VERSION:=go1.24.
+GO_VERSION:=go1.25.
AGENTS.md (1)

94-124: ⚠️ Potential issue | 🟠 Major

Update AGENTS.md project structure to reflect the new generic resource-driven architecture.

The documentation at lines 94-124 references removed cluster/nodepool-specific files (cluster.go, nodepool.go, cluster_handler.go, nodepool_handler.go, cluster_service.go, nodepool_service.go) that no longer exist. Replace with the current generic resource files:

  • pkg/dao/resource.go (resource CRUD operations)
  • pkg/handlers/resource.go (resource endpoint handlers)
  • pkg/services/resource.go (resource business operations)
  • pkg/api/resource_definition.go, resource_id.go, resource_types.go (API models)
🤖 Fix all issues with AI agents
In `@charts/crds/cluster-crd.yaml`:
- Around line 60-65: The Condition status enum in the CRD `conditions` schema
only includes "True" and "False" but must also include "Unknown" per Kubernetes
metav1 ConditionStatus; update the enum for the condition status field (the YAML
block with "type: string", "description: Status of the condition", "enum: -
\"True\" - \"False\"") to add "- \"Unknown\"" and apply this same change
consistently to the Cluster CRD, IDP CRD and Nodepool CRD condition schemas.

In `@charts/crds/idp-crd.yaml`:
- Around line 75-81: The Condition "status" enum currently only lists "True" and
"False" which is invalid for Kubernetes conditions; update the enum under the
condition's status field (the "status:" block shown in the CRD) to include
"Unknown" as a third value (so the enum contains "True", "False", and "Unknown")
to allow valid ConditionStatus values and pass CRD validation.

In `@charts/crds/nodepool-crd.yaml`:
- Around line 65-70: Update the CRD's condition status enum to include "Unknown"
so it matches AdapterConditionStatus defined in pkg/openapi/common.go; locate
the status field under conditions in nodepool-crd.yaml and add "Unknown" to the
enum values ("True", "False", "Unknown") to prevent validation mismatches when
adapters report Unknown.

In `@charts/templates/rbac/clusterrole-admin.yaml`:
- Around line 1-35: Add a matching ClusterRoleBinding for the admin ClusterRole:
create a new template that, when .Values.rbac.create is true, defines a
ClusterRoleBinding named {{ include "hyperfleet-api.fullname" . }}-admin (or
similar) with roleRef pointing to the ClusterRole named {{ include
"hyperfleet-api.fullname" . }}-admin (apiGroup rbac.authorization.k8s.io, kind
ClusterRole) and a subject that binds the chart's ServiceAccount (use the chart
helper or value used elsewhere, e.g. {{ include
"hyperfleet-api.serviceAccountName" . }} or .Values.serviceAccount.name) in the
same namespace; follow the existing viewer ClusterRoleBinding structure for
field names and templating patterns.

In `@go.mod`:
- Line 3: Update the Makefile to match go.mod by changing the GO_VERSION
variable (currently referenced as GO_VERSION:=go1.24.) to go1.25., and remove
any BoringCrypto/GOEXPERIMENT=boringcrypto cgo-based build flags; instead add
the new Go 1.25 FIPS options: set the build/time environment to enable GOFIPS140
and ensure runtime uses GODEBUG=fips140=on where appropriate (e.g., in
test/run/build targets that previously used GOEXPERIMENT or BoringCrypto), and
document the change in the Makefile comments so CI and local build targets use
go1.25 and the native FIPS 140-3 strategy.
- Line 61: The project pulls a vulnerable transitive dependency
github.com/dgrijalva/jwt-go via github.com/auth0/go-jwt-middleware v0.x; update
the go.mod to require github.com/auth0/go-jwt-middleware v2.x or later (which
removes the dgrijalva jwt-go transitively), run go get
github.com/auth0/go-jwt-middleware@v2 and go mod tidy to update the lock, and
then adjust any usage sites of the middleware (references to
github.com/auth0/go-jwt-middleware types/constructors such as New or Middleware)
to the v2 API surface if needed so imports and function calls compile. Ensure
the vulnerable github.com/dgrijalva/jwt-go entry is no longer present in
go.sum/go.mod after the upgrade.

In `@pkg/crd/registry.go`:
- Around line 298-304: The current loop builds enumValues by converting
schema.Enum entries using string(e.Raw), which preserves JSON quotes for string
enums; instead unmarshal each e.Raw into a variable (e.g., var v interface{})
using json.Unmarshal and assign that unmarshaled value into enumValues[i] so
strings become plain Go strings and numbers/booleans become their native types;
for robustness handle json.Unmarshal error (fallback to raw string or skip) and
then set result["enum"] = enumValues as before.

In `@pkg/dao/resource.go`:
- Around line 161-181: ResourceService.ListByKind must guard against page size
== 0 and log+return early like generic.go: before calling dao.ListByKind, create
a logger with logger.NewOCMLogger(ctx), check the incoming ListArguments.Size
(or pageSize) == 0 and use the logger to emit a structured info/debug message,
then return an empty api.ResourceList and total 0 without calling
sqlResourceDao.ListByKind; ensure the same behavior and message pattern as the
generic handler (early return when Size==0) so callers passing pageSize=0 don't
receive the full result set.

In `@pkg/db/migrations/202602060001_add_resources.go`:
- Around line 37-40: Rename the polymorphic ownership column from owner_kind to
owner_type in the migration (the diff adding owner_id, owner_kind, owner_href)
and update any references to owner_kind across the codebase/migrations to
owner_type so the schema follows the owner_type + owner_id pattern; ensure the
migration uses owner_type VARCHAR(63) NULL (replacing owner_kind) and verify
related queries, structs, and tests reference owner_type instead of owner_kind.

In `@pkg/db/migrations/migration_structs.go`:
- Around line 29-37: The addResources() migration currently only creates the
resources table and indexes but does not backfill existing data, so
dropClusterNodePoolTables() then deletes legacy records; modify addResources()
(or add a new migration immediately after it but before
dropClusterNodePoolTables()) to perform a data migration that copies rows from
clusters and node_pools into the new resources table (mapping fields and
preserving IDs/relations and any required status/condition data and indexes),
ensure the backfill runs inside the same transactional migration or signals
completion before dropClusterNodePoolTables() executes, and verify by
referencing the MigrationList ordering so dropClusterNodePoolTables() is
deferred until after successful backfill.

In `@pkg/handlers/resource.go`:
- Around line 280-287: The ownerHref is being built with h.plural (child
resource plural) instead of the owner's plural, producing incorrect owner URLs;
update the construction in the h.isOwned block to use the owner's plural (e.g.,
h.ownerPlural) rather than h.plural — if h.ownerPlural doesn't exist, add a
field or derive the correct owner path from h.ownerKind (via a small helper that
maps kind → plural) and set ownerHref = fmt.Sprintf("/api/hyperfleet/v1/%s/%s",
ownerPlural, ownerID); keep resource.OwnerID, resource.OwnerKind and
resource.OwnerHref assignment otherwise unchanged.
- Around line 186-193: The Patch handler currently updates found.Spec (marshal
of patch.Spec) but does not increment the resource's generation; ensure
optimistic concurrency by incrementing found.Generation whenever the spec
changes (e.g., inside the Patch handler right after setting found.Spec) before
calling Replace(), or alternatively increment generation inside the service
Replace() path; update the code path that handles spec updates (look for
patch.Spec, found.Spec and the call to Replace()) so that found.Generation++ (or
equivalent) is applied for every spec mutation and persisted by Replace().

In `@pkg/openapi/generator.go`:
- Around line 69-77: Registry.All() may return an internal slice that must not
be mutated; before the sort.Slice call that sorts defs, make a copy of the slice
(e.g., create a new slice with make and copy the elements) and sort the copy so
you do not mutate registry-owned state; update subsequent use to refer to the
copied slice (defsCopy) instead of defs to avoid races with concurrent readers.

In `@pkg/openapi/paths.go`:
- Around line 306-341: buildCreateStatusOperation currently omits the OpenAPI
security requirement; update the returned *openapi3.Operation in
buildCreateStatusOperation to include the same Security value used by other
endpoints (see buildListOperation/buildCreateOperation) by setting the
Operation.Security field to the shared security requirement (use the same helper
or literal used elsewhere so adapters must authenticate when posting status).
- Around line 233-248: The PATCH operation in the OpenAPI path generator
currently references the CreateRequest schema; update the RequestBody for the
PATCH operation (in the code that builds the operation in pkg/openapi/paths.go
where you use def.Kind) to reference "#/components/schemas/" + def.Kind +
"PatchRequest" instead of CreateRequest, ensure RequestBody.Required is false
(or removed) to allow partial payloads, and make sure a corresponding
{Kind}PatchRequest schema is generated in the components/schemas (matching the
handler's ResourcePatchRequest with optional/omitempty Spec and Labels used by
pkg/handlers/resource.go:162).
- Around line 192-195: The GET-by-ID OpenAPI operation currently includes the
global "search" parameter but the handler uses only the path ID; update
buildGetOperation so the params slice contains only the path parameter by
removing the {Ref: "#/components/parameters/search"} entry (i.e., leave only
buildPathParameter(def.Singular+"_id", def.Kind+" ID") in the params slice).
Ensure buildGetOperation (and any related parameter construction code) no longer
references the search parameter for the single-resource GET.

In `@pkg/openapi/schemas.go`:
- Around line 270-358: The code in applySchemaProperties and
convertCRDSchemaToOpenAPI is silently dropping constraints because it
type-asserts to narrow Go types (e.g., []string, int64) after unmarshalling;
update these functions to accept generic unmarshalled types: convert required
when it's []interface{} to []string and assign to schema.Required; accept
minLength/maxLength as float64 or int (from interface{}), convert to uint64 and
set schema.MinLength/schema.MaxLength (use pointers where needed); handle
additionalProperties when it's a bool or an object (map[string]interface{}) and
set schema.AdditionalProperties accordingly; likewise ensure items/properties
handling remains by converting map[string]interface{} to SchemaRefs. Apply
changes to the functions convertCRDSchemaToOpenAPI and applySchemaProperties and
to fields schema.Required, schema.MinLength, schema.MaxLength,
schema.AdditionalProperties, and schema.Properties/Items conversions so
constraints aren't lost.

In `@pkg/services/resource.go`:
- Around line 111-117: The Replace method currently calls s.resourceDao.Replace
without bumping the resource.Generation; update sqlResourceService.Replace to
load the current resource (via s.resourceDao.Get or equivalent), compare the
incoming resource.Spec with the stored resource.Spec, and if they differ
increment resource.Generation (e.g., resource.Generation++ ) before calling
s.resourceDao.Replace; ensure you handle the case where the existing resource is
not found or Get returns an error and propagate that error through the same
error handling path (use handleUpdateError for Replace failures).

In `@plugins/resources/plugin.go`:
- Around line 159-165: The fallback in getOwnerPlural returns kind+"s" which
preserves uppercase (e.g., "Cluster" -> "Clusters") causing route mismatches;
update the fallback to use lowercase by importing "strings" and returning
strings.ToLower(kind) + "s" in the getOwnerPlural function so API paths become
correct (e.g., "clusters").
- Around line 102-107: Remove the redundant value-type presenter registrations
for api.Resource: delete the calls that register presenters using api.Resource{}
(the value type) — specifically the presenters.RegisterPath(api.Resource{},
"resources") and presenters.RegisterKind(api.Resource{}, "Resource") lines — and
keep only the pointer registrations (presenters.RegisterPath(&api.Resource{},
"resources") and presenters.RegisterKind(&api.Resource{}, "Resource")) so
runtime lookups using &api.Resource{} will match.
🧹 Nitpick comments (13)
scripts/test-api.sh (1)

5-31: Consider failing fast on HTTP errors in the test script.

set -e alone won’t fail on non‑2xx curl responses. Adding pipefail and using curl -f/-S makes the script more reliable when the API returns errors.

🔧 Suggested change
-set -e
+set -euo pipefail
AGENTS.md (1)

132-138: Add language identifier to fenced code block.

The code block showing the CRD-driven flow lacks a language specifier, which triggers the markdown linter warning.

📝 Suggested fix
-```
+```text
 charts/crds/*.yaml (CRD definitions)
     ↓ loaded at startup
 pkg/crd/registry.go (CRD registry)
     ↓ generates
 Dynamic routes + OpenAPI spec at runtime
</details>

</blockquote></details>
<details>
<summary>pkg/handlers/resource.go (1)</summary><blockquote>

`204-204`: **TODO: Hardcoded user identity should be extracted from auth context.**

The `UpdatedBy` field is hardcoded to `"system@hyperfleet.local"`. Similar hardcoding exists at lines 276-277 for `CreatedBy` and `UpdatedBy` in the create path.

This is acceptable for initial implementation, but ensure this is tracked for follow-up before production use.


Would you like me to open an issue to track implementing proper user identity extraction from the authentication context?

</blockquote></details>
<details>
<summary>test/factories/resources.go (1)</summary><blockquote>

`100-104`: **Redundant `OwnerID` assignment.**

Line 102 sets `nodePool.OwnerID = &cluster.ID`, but this is already set in `NewNodePoolResource` at line 66 where `OwnerID: &clusterID` is passed. The comment says "for test compatibility" but the value is identical.

If specific tests require a different pattern, consider documenting why; otherwise, this assignment is redundant.


<details>
<summary>♻️ Suggested cleanup</summary>

```diff
 	// Create the nodepool owned by the cluster
 	nodePool, err := f.NewNodePoolResource(id, cluster.ID)
 	if err != nil {
 		return nil, fmt.Errorf("failed to create nodepool: %w", err)
 	}

-	// Set OwnerID for test compatibility (some tests check this)
-	nodePool.OwnerID = &cluster.ID
-
 	return nodePool, nil
cmd/hyperfleet-api/server/routes.go (1)

119-152: Consider caching the generated OpenAPI spec to avoid duplicate generation.

The OpenAPI spec is generated twice at startup: once here for schema validation (line 139) and once in NewOpenAPIHandler() (pkg/handlers/openapi.go). While this only affects startup time, consider caching the generated spec (e.g., on the CRD registry or a shared singleton) to avoid the redundant computation.

💡 Sketch of potential consolidation
// In pkg/openapi/generator.go or pkg/crd/registry.go
var (
    cachedSpec     *openapi3.T
    cachedSpecOnce sync.Once
)

func GetOrGenerateSpec(registry *crd.Registry) *openapi3.T {
    cachedSpecOnce.Do(func() {
        cachedSpec = GenerateSpec(registry)
    })
    return cachedSpec
}

Then both routes.go and openapi.go can call openapi.GetOrGenerateSpec(crd.DefaultRegistry()).

pkg/crd/registry_test.go (1)

64-75: Make the empty-directory test deterministic.

Relying on ../../bin is environment-dependent; t.TempDir() makes this test stable.

♻️ Suggested change
 func TestLoadFromDirectory_EmptyDirectory(t *testing.T) {
 	RegisterTestingT(t)

 	registry := NewRegistry()
-	// Use a directory that exists but has no YAML files
-	err := registry.LoadFromDirectory("../../bin")
+	emptyDir := t.TempDir()
+	err := registry.LoadFromDirectory(emptyDir)

-	// Should succeed but load nothing (bin may not exist, so we just check no panic)
-	if err == nil {
-		Expect(registry.Count()).To(Equal(0))
-	}
+	Expect(err).To(BeNil())
+	Expect(registry.Count()).To(Equal(0))
 }
pkg/dao/resource.go (1)

143-150: Consider verifying kind in Delete to prevent cross-resource-type deletions.

Delete accepts only an ID and doesn't verify the resource kind. If UUIDs are reused or collide across resource types (unlikely but possible), this could delete the wrong resource. The DeleteByKindAndID method exists but callers must remember to use it.

If intentional for internal use, consider documenting this or making Delete private.

pkg/openapi/generator_test.go (1)

52-79: Consider adding negative test cases.

The happy-path tests are solid. Consider adding tests for edge cases:

  • Registering an owned resource without its owner
  • Duplicate registrations
  • Disabled resources (Enabled: false)
pkg/openapi/paths.go (1)

77-81: Silent failure when owner definition is missing could hide configuration errors.

When an owned resource's owner definition isn't found in the registry, the function silently returns without generating paths. Consider logging a warning to help diagnose misconfigured CRDs.

Proposed fix
 func addOwnedResourcePaths(doc *openapi3.T, def *api.ResourceDefinition, registry *crd.Registry) {
 	ownerDef := getOwnerDefinitionFromRegistry(def, registry)
 	if ownerDef == nil {
+		// Log warning to help diagnose missing owner CRD
+		// logger.Warn("Owner definition not found for owned resource", "kind", def.Kind, "owner", def.Owner.Kind)
 		return // Cannot generate paths without owner definition
 	}
pkg/api/resource_definition.go (1)

45-52: Potential naming collision with validators.ResourceSchema.

There's another ResourceSchema type in pkg/validators/schema_validator.go (lines 12-15) with different fields (TypeName string, Schema *openapi3.SchemaRef). While they're in different packages and won't cause compilation errors, having two types with the same name serving different purposes could cause confusion.

Consider renaming one of them (e.g., CRDResourceSchema here or ValidatorResourceSchema there) to improve clarity.

pkg/crd/registry.go (1)

70-112: Consider reducing lock scope during Kubernetes API calls.

The write lock is held for the entire duration including the Kubernetes API call (line 86), which involves network I/O. While this is acceptable for startup-only loading, if this needs to support runtime reloading in the future, consider:

  1. Fetching CRDs without the lock
  2. Processing results
  3. Acquiring the lock only for the final registration

This is fine for the current use case since loading happens at startup.

pkg/services/resource.go (2)

186-189: Silent error handling may hide issues.

When json.Unmarshal fails (line 187), the adapter status is silently skipped with continue. This could mask data corruption or schema mismatches. Consider logging a warning:

🔧 Proposed fix
 		// Unmarshal Conditions from JSONB
 		var conditions []api.AdapterCondition
 		if err := json.Unmarshal(adapterStatus.Conditions, &conditions); err != nil {
-			continue // Skip if can't unmarshal
+			logger.WithError(ctx, err).Warn("Failed to unmarshal adapter conditions, skipping")
+			continue
 		}

264-275: Fragile error handling with string matching.

Lines 267-271 mix errors.Is with string-based error checking (strings.Contains). This is brittle because:

  1. Error messages can change
  2. CodeNotFoundGeneric in error strings doesn't guarantee the actual error type

Consider refactoring to use typed error checking consistently:

🔧 Proposed fix
 	existingStatus, findErr := s.adapterStatusDao.FindByResourceAndAdapter(
 		ctx, kind, resourceID, adapterStatus.Adapter,
 	)
-	if findErr != nil && !stderrors.Is(findErr, gorm.ErrRecordNotFound) {
-		if !strings.Contains(findErr.Error(), errors.CodeNotFoundGeneric) {
-			return nil, errors.GeneralError("Failed to get adapter status: %s", findErr)
-		}
+	if findErr != nil && !stderrors.Is(findErr, gorm.ErrRecordNotFound) {
+		// If it's not a "not found" scenario, it's a real error
+		return nil, errors.GeneralError("Failed to get adapter status: %s", findErr)
 	}

If FindByResourceAndAdapter can return a custom "not found" error type, consider having the DAO return gorm.ErrRecordNotFound or a typed error that can be checked with errors.Is.

Comment on lines +1 to +35
{{- if .Values.rbac.create -}}
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
name: {{ include "hyperfleet-api.fullname" . }}-admin
labels:
{{- include "hyperfleet-api.labels" . | nindent 4 }}
rules:
# Full CRUD access to HyperFleet resources
- apiGroups:
- hyperfleet.io
resources:
- clusters
- nodepools
- idps
verbs:
- create
- delete
- get
- list
- patch
- update
- watch
# Status subresource update permissions
- apiGroups:
- hyperfleet.io
resources:
- clusters/status
- nodepools/status
- idps/status
verbs:
- get
- patch
- update
{{- end }}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# First, explore the RBAC templates directory structure
echo "=== RBAC Templates Directory ==="
find charts/templates/rbac -type f -name "*.yaml" | sort

echo -e "\n=== ClusterRoleBinding Content ==="
if [ -f "charts/templates/rbac/clusterrolebinding.yaml" ]; then
  cat -n charts/templates/rbac/clusterrolebinding.yaml
else
  echo "clusterrolebinding.yaml not found"
fi

echo -e "\n=== Searching for 'admin' in RBAC yaml files ==="
grep -r "admin" charts/templates/rbac/ --include="*.yaml" -i

Repository: openshift-hyperfleet/hyperfleet-api

Length of output: 1051


🏁 Script executed:

#!/bin/bash
# Check if clusterrole.yaml is the viewer role to confirm the pattern
echo "=== Content of clusterrole.yaml ==="
head -20 charts/templates/rbac/clusterrole.yaml

Repository: openshift-hyperfleet/hyperfleet-api

Length of output: 664


Create a ClusterRoleBinding for the admin role to match the viewer role pattern.

The admin ClusterRole is defined but has no corresponding ClusterRoleBinding. The viewer role (crd-viewer) follows a clear pattern: role definition in clusterrole.yaml paired with binding in clusterrolebinding.yaml. The admin role should follow the same pattern with a binding that grants the admin role to the ServiceAccount.

🧰 Tools
🪛 YAMLlint (1.38.0)

[error] 1-1: syntax error: expected the node content, but found '-'

(syntax)

🤖 Prompt for AI Agents
In `@charts/templates/rbac/clusterrole-admin.yaml` around lines 1 - 35, Add a
matching ClusterRoleBinding for the admin ClusterRole: create a new template
that, when .Values.rbac.create is true, defines a ClusterRoleBinding named {{
include "hyperfleet-api.fullname" . }}-admin (or similar) with roleRef pointing
to the ClusterRole named {{ include "hyperfleet-api.fullname" . }}-admin
(apiGroup rbac.authorization.k8s.io, kind ClusterRole) and a subject that binds
the chart's ServiceAccount (use the chart helper or value used elsewhere, e.g.
{{ include "hyperfleet-api.serviceAccountName" . }} or
.Values.serviceAccount.name) in the same namespace; follow the existing viewer
ClusterRoleBinding structure for field names and templating patterns.

@openshift-ci
Copy link

openshift-ci bot commented Mar 4, 2026

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign ciaranroche for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

ciaranRoche and others added 6 commits March 4, 2026 11:22
Implements a generic resource API that allows defining new resource types
via YAML configuration files, without requiring code changes.

Features:
- Generic resource handler with CRUD operations for any resource type
- CRD-style resource definitions in config/crds/
- Kubernetes CRDs in Helm chart for API discoverability
- RBAC ClusterRoles for viewer and admin access
- Database migration for generic resources table
- Plugin architecture for resource type registration
- ARM64 cross-compilation support in Dockerfile

Resource types defined:
- Cluster (root-level, cluster-scoped in K8s)
- NodePool (owned by Cluster, namespaced in K8s)
- IDP (owned by Cluster, namespaced in K8s)

The CRDs are for schema discoverability only - PostgreSQL remains
the source of truth (not a K8s operator pattern).

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Remove all cluster-specific and nodepool-specific handlers, services,
  DAOs, presenters, and API types
- Remove old database migrations for clusters/node_pools tables
- Add migration to drop legacy cluster/nodepool tables
- Update CRD registry to load resource definitions from Kubernetes API
  instead of filesystem (config/crds directory removed)
- Add hyperfleet.io annotations to Helm CRDs for API configuration:
  - hyperfleet.io/scope: Root or Owned
  - hyperfleet.io/owner-kind: Parent resource kind
  - hyperfleet.io/owner-path-param: URL path parameter
  - hyperfleet.io/required-adapters: Comma-separated adapter list
  - hyperfleet.io/enabled: Enable/disable flag
- Update RBAC to allow listing CRDs cluster-wide
- Add generic test factory for Resource instances
- Update Dockerfile to remove config/crds copy

The generic resources table now handles all resource types dynamically
based on CRD definitions installed in the Kubernetes cluster.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Replace static OpenAPI specification with dynamic generation from CRD
definitions loaded from Kubernetes. This ensures the API documentation
always matches the actual CRD schemas.

Changes:
- Add pkg/openapi package with generator, paths, schemas, and common modules
- Extract openAPIV3Schema from CRDs when loading into registry
- Update OpenAPI handler to use dynamic generator
- Update schema validator to work with generated specs
- Remove static openapi/openapi.yaml and oapi-codegen.yaml
- Update Makefile to remove generate target (types are now static)
- Update Dockerfile to remove static OpenAPI file copy

The generated Go types in pkg/api/openapi/openapi.gen.go are preserved
as they are still used for request/response handling throughout the
codebase.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Add ability to load CRDs from local YAML files via CRD_PATH environment
variable, allowing local development without requiring a Kubernetes cluster.

Changes:
- Add LoadFromDirectory() to pkg/crd/registry.go for loading CRDs from files
- Update plugins/resources/plugin.go to check CRD_PATH env var first
- Set CRD_PATH in Makefile run/run-no-auth targets
- Remove erroneous global list endpoints for owned resources
- Update docs/development.md for new CRD-driven workflow
- Rename Default() to DefaultRegistry() to avoid gomega conflict

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Remove references to obsolete make generate and make generate-all
- Update README.md installation steps and common commands
- Update AGENTS.md development workflow and troubleshooting
- Remove global list endpoint from NodePool documentation

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Add "Unknown" to condition status enum in all CRDs per K8s metav1.ConditionStatus
- Add ClusterRoleBinding for admin ClusterRole
- Migrate Makefile from BoringCrypto/GOEXPERIMENT to Go 1.25 native FIPS 140-3 (GOFIPS140/GODEBUG)
- Add BearerAuth security requirement to buildCreateStatusOperation
- Fix silent constraint drops in CRD-to-OpenAPI schema conversion (handle []interface{}, float64, additionalProperties as object)
- Bump generation in Replace when spec changes
- Remove redundant value-type presenter registrations
- Fix getOwnerPlural fallback to use lowercase

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
Makefile (1)

145-154: ⚠️ Potential issue | 🔴 Critical

Make the Go version check fail the install target when version doesn't match.

The current version guard is non-blocking—a Go version mismatch prints a warning but allows make install to succeed, which risks silently building with the wrong toolchain. Since Go 1.25 is pinned for FIPS-compliant builds, enforcing the version check is necessary to prevent incorrect binary generation.

Suggested fix
 install: check-gopath generate-mocks
 	CGO_ENABLED=$(CGO_ENABLED) GOFIPS140=$(GOFIPS140) ${GO} install -ldflags="$(ldflags)" ./cmd/hyperfleet-api
 	@ ${GO} version | grep -q "$(GO_VERSION)" || \
 		( \
 			printf '\033[41m\033[97m\n'; \
 			echo "* Your go version is not the expected $(GO_VERSION) *" | sed 's/./*/g'; \
 			echo "* Your go version is not the expected $(GO_VERSION) *"; \
 			echo "* Your go version is not the expected $(GO_VERSION) *" | sed 's/./*/g'; \
 			printf '\033[0m'; \
+			exit 1; \
 		)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@Makefile` around lines 145 - 154, The install target currently prints a
warning on Go version mismatch but does not fail; update the fallback block
after the "${GO} version | grep -q "$(GO_VERSION)" || ( ... )" check to
terminate the shell with a non-zero status (e.g., add "exit 1" at the end of
that parenthesized block) so make stops on a version mismatch; reference the
install target and the ${GO} / $(GO_VERSION) version check to locate and change
the code.
♻️ Duplicate comments (2)
pkg/openapi/paths.go (2)

233-240: ⚠️ Potential issue | 🟠 Major

PATCH operation still uses create schema instead of a partial-update schema.

At Line 233-240, PATCH references {Kind}CreateRequest (and keeps Required: true), which does not match partial-update semantics and over-constrains clients.

OpenAPI PATCH contract fix
 	RequestBody: &openapi3.RequestBodyRef{
 		Value: &openapi3.RequestBody{
-			Required: true,
+			Required: false,
 			Content: openapi3.Content{
 				"application/json": &openapi3.MediaType{
 					Schema: &openapi3.SchemaRef{
-						Ref: "#/components/schemas/" + def.Kind + "CreateRequest",
+						Ref: "#/components/schemas/" + def.Kind + "PatchRequest",
 					},
 				},
 			},
 		},
 	},

You’ll also need to generate {Kind}PatchRequest in pkg/openapi/schemas.go (optional fields only, matching handler patch behavior).

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/openapi/paths.go` around lines 233 - 240, The PATCH operation currently
uses the full {Kind}CreateRequest schema and marks RequestBody Required: true,
which is incorrect for partial updates; change the openapi3.RequestBodyRef used
in the PATCH path (the SchemaRef that currently sets Ref:
"#/components/schemas/{Kind}CreateRequest") to point to a new
"{Kind}PatchRequest" schema and set Required to false (or remove the required
constraint) so partial fields are allowed; also add a new optional-only
"{Kind}PatchRequest" schema in the schemas generator (pkg/openapi/schemas.go)
that mirrors the resource fields but makes them optional to match handler patch
behavior.

192-195: ⚠️ Potential issue | 🟡 Minor

Remove search from GET-by-ID operation parameters.

At Line 192-195, the single-resource GET operation still includes #/components/parameters/search, which is a list/filter concern, not an ID lookup concern.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/openapi/paths.go` around lines 192 - 195, The GET-by-ID operation
currently constructs params with a global search parameter and a path ID
parameter; remove the list/filter parameter so only the ID path parameter
remains. In the code that builds the operation params (the params variable
containing openapi3.ParameterRef and buildPathParameter(def.Singular+"_id",
def.Kind+" ID")), delete the entry with Ref: "#/components/parameters/search"
and leave only the buildPathParameter call so the GET-by-ID uses only the path
parameter. Ensure no other logic expects the search param for single-resource
GET operations.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@Makefile`:
- Around line 248-252: Update the stale comment/header for the Makefile target
run/docs so it no longer claims to “Run Swagger and host the api docs”; replace
that header with a concise informational description such as “Print information
about the dynamically generated OpenAPI docs” and adjust the following
explanatory line to clarify it only prints the URL (keep the existing echo
commands and note about dynamic OpenAPI generation unchanged); target to edit:
run/docs in the Makefile.

In `@pkg/services/resource.go`:
- Around line 118-123: The current logic only sets resource.Generation when the
Spec changes, which can let a missing/zero Generation slip through on unchanged
specs; update the block around existing, resource and s.resourceDao.Replace so
that if bytes.Equal(existing.Spec, resource.Spec) you explicitly set
resource.Generation = existing.Generation (preserve it), otherwise set
resource.Generation = existing.Generation + 1 (increment on change), then call
s.resourceDao.Replace; reference the existing, resource, Generation, Spec and
s.resourceDao.Replace symbols when making the change.
- Around line 103-107: Create currently calls UpdateResourceStatusFromAdapters
after the resource is already persisted and returns its error, causing a false
"create failed" while the resource exists; change Create to not return the
adapter-aggregation error after persistence: call
UpdateResourceStatusFromAdapters (s.UpdateResourceStatusFromAdapters) but on
svcErr do not propagate it to the caller—log it (or enqueue a background retry)
and return the committed resource (resource/updatedResource) and nil error;
ensure any retry/async handling references resource.Kind, resource.ID and
requiredAdapters so status aggregation can be retried without causing duplicate
creates.

In `@plugins/resources/plugin.go`:
- Around line 141-147: The routes currently register only standard CRUD via
router.HandleFunc("", handler.List/Create) and router.HandleFunc("/{id}",
handler.Get/Patch/Delete); add registrations for the status subresource so the
OpenAPI paths are served by wiring router.HandleFunc("/{id}/statuses", ...) for
GET and POST. Specifically, add router.HandleFunc("/{id}/statuses",
handler.ListStatuses).Methods(http.MethodGet) and
router.HandleFunc("/{id}/statuses",
handler.CreateStatus).Methods(http.MethodPost) (or the equivalent handler
methods your resource handler implements) so the documented GET/POST
/{resource_id}/statuses endpoints do not 404.

---

Outside diff comments:
In `@Makefile`:
- Around line 145-154: The install target currently prints a warning on Go
version mismatch but does not fail; update the fallback block after the "${GO}
version | grep -q "$(GO_VERSION)" || ( ... )" check to terminate the shell with
a non-zero status (e.g., add "exit 1" at the end of that parenthesized block) so
make stops on a version mismatch; reference the install target and the ${GO} /
$(GO_VERSION) version check to locate and change the code.

---

Duplicate comments:
In `@pkg/openapi/paths.go`:
- Around line 233-240: The PATCH operation currently uses the full
{Kind}CreateRequest schema and marks RequestBody Required: true, which is
incorrect for partial updates; change the openapi3.RequestBodyRef used in the
PATCH path (the SchemaRef that currently sets Ref:
"#/components/schemas/{Kind}CreateRequest") to point to a new
"{Kind}PatchRequest" schema and set Required to false (or remove the required
constraint) so partial fields are allowed; also add a new optional-only
"{Kind}PatchRequest" schema in the schemas generator (pkg/openapi/schemas.go)
that mirrors the resource fields but makes them optional to match handler patch
behavior.
- Around line 192-195: The GET-by-ID operation currently constructs params with
a global search parameter and a path ID parameter; remove the list/filter
parameter so only the ID path parameter remains. In the code that builds the
operation params (the params variable containing openapi3.ParameterRef and
buildPathParameter(def.Singular+"_id", def.Kind+" ID")), delete the entry with
Ref: "#/components/parameters/search" and leave only the buildPathParameter call
so the GET-by-ID uses only the path parameter. Ensure no other logic expects the
search param for single-resource GET operations.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 61d374fa-d2c9-4fac-a3c3-53fb9c8cf300

📥 Commits

Reviewing files that changed from the base of the PR and between c11a80e and 52e3a10.

📒 Files selected for processing (10)
  • Makefile
  • charts/crds/cluster-crd.yaml
  • charts/crds/idp-crd.yaml
  • charts/crds/nodepool-crd.yaml
  • charts/templates/rbac/clusterrolebinding-admin.yaml
  • charts/values.yaml
  • pkg/openapi/paths.go
  • pkg/openapi/schemas.go
  • pkg/services/resource.go
  • plugins/resources/plugin.go
🚧 Files skipped from review as they are similar to previous changes (3)
  • charts/crds/idp-crd.yaml
  • charts/crds/nodepool-crd.yaml
  • charts/values.yaml

Makefile Outdated
Comment on lines +248 to +252
# Run Swagger and host the api docs
# Note: With dynamic OpenAPI generation, use the /api/hyperfleet/v1/openapi.html endpoint instead
run/docs:
@echo "OpenAPI spec is now dynamically generated from CRDs."
@echo "Start the server and visit: http://localhost:8000/api/hyperfleet/v1/openapi.html"
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Target description is stale: run/docs no longer “runs Swagger.”

The comment/header still says “Run Swagger and host the api docs,” but target behavior is now informational only. Please align wording to avoid operator confusion.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@Makefile` around lines 248 - 252, Update the stale comment/header for the
Makefile target run/docs so it no longer claims to “Run Swagger and host the api
docs”; replace that header with a concise informational description such as
“Print information about the dynamically generated OpenAPI docs” and adjust the
following explanatory line to clarify it only prints the URL (keep the existing
echo commands and note about dynamic OpenAPI generation unchanged); target to
edit: run/docs in the Makefile.

The POC generates OpenAPI specs dynamically from CRDs at runtime,
so the static oapi-codegen generation step is no longer needed.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 5

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
AGENTS.md (1)

27-27: ⚠️ Potential issue | 🟡 Minor

Go version inconsistency with go.mod.

The documentation states Go 1.24.9, but go.mod has been updated to Go 1.25.0. Update this line to reflect the correct version.

-- **Language**: Go 1.24.9 with FIPS-compliant crypto (CGO_ENABLED=1, GOEXPERIMENT=boringcrypto)
+- **Language**: Go 1.25.0 with FIPS-compliant crypto
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@AGENTS.md` at line 27, Update the Go version string in AGENTS.md that
currently reads "Go 1.24.9" to match go.mod by changing it to "Go 1.25.0" (the
line starting with "**Language**: Go ..."); ensure the FIPS/CGO flags text
(CGO_ENABLED=1, GOEXPERIMENT=boringcrypto) remains unchanged.
docs/development.md (1)

9-9: ⚠️ Potential issue | 🟡 Minor

Update Go version requirement.

The documentation states "Go 1.24 or higher", but based on project configuration, the project now uses Go 1.25 for FIPS-compliant crypto requirements. Update the prerequisite to reflect the actual minimum version.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/development.md` at line 9, Update the Go prerequisite text that
currently reads "**Go 1.24 or higher**" to "**Go 1.25 or higher**" to match the
project's FIPS-compliant crypto requirement; search for the exact string "**Go
1.24 or higher**" in docs (e.g., development.md and any README/guide files) and
replace it with "**Go 1.25 or higher**", and also ensure any related badge or
CI/Tooling docs referencing Go 1.24 are updated to 1.25 for consistency.
♻️ Duplicate comments (10)
pkg/openapi/generator.go (1)

69-77: ⚠️ Potential issue | 🟠 Major

Copy defs before sorting to avoid mutating shared registry state.

Sorting defs in place can mutate registry-owned ordering and cause shared-state/race issues.

🔧 Proposed fix
 	// Get all resource definitions and sort them for deterministic output
 	defs := registry.All()
+	defs = append(defs[:0:0], defs...) // copy before sort
 	sort.Slice(defs, func(i, j int) bool {
 		// Root resources first, then by kind name
 		if defs[i].IsRoot() != defs[j].IsRoot() {
 			return defs[i].IsRoot()
 		}
 		return defs[i].Kind < defs[j].Kind
 	})
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/openapi/generator.go` around lines 69 - 77, The current code sorts the
slice returned by registry.All() in-place (defs) which can mutate registry-owned
state; to fix it, make a shallow copy of the slice returned by registry.All()
before calling sort.Slice so you don't change the original registry slice, then
run the existing sort.Slice on that copy (keep the same comparison using
IsRoot() and Kind) and use the copied, sorted slice for downstream processing.
pkg/db/migrations/202602060001_add_resources.go (1)

37-40: ⚠️ Potential issue | 🟠 Major

Use owner_type (not owner_kind) for polymorphic ownership.

This migration introduces owner_kind, which diverges from the polymorphic ownership convention and will cause schema inconsistency.

🔧 Proposed fix
-					owner_kind VARCHAR(63) NULL,
+					owner_type VARCHAR(63) NULL,

As per coding guidelines: "Implement polymorphic status tables using owner_type + owner_id pattern to support multiple resource types without separate status tables".

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/db/migrations/202602060001_add_resources.go` around lines 37 - 40, The
migration adds owner_kind which breaks the polymorphic convention; replace the
owner_kind column with owner_type and keep owner_id/owner_href, i.e., alter the
DDL in 202602060001_add_resources.go to create owner_type VARCHAR(63) NULL
instead of owner_kind, update any associated indexes/constraints to reference
owner_type, and search/rename all code references (migrations, models, queries)
that use owner_kind to use owner_type so the schema follows the
owner_type+owner_id polymorphic pattern.
pkg/openapi/schemas.go (1)

310-316: ⚠️ Potential issue | 🟠 Major

Preserve integer minimum/maximum constraints during CRD conversion.

minimum/maximum are parsed only as float64; integer-typed values can be silently ignored and constraints get dropped.

🔧 Proposed fix
-	if min, ok := crdSchema["minimum"].(float64); ok {
+	if min, ok := toFloat64(crdSchema["minimum"]); ok {
 		schema.Min = &min
 	}
 
-	if max, ok := crdSchema["maximum"].(float64); ok {
+	if max, ok := toFloat64(crdSchema["maximum"]); ok {
 		schema.Max = &max
 	}
+func toFloat64(v interface{}) (float64, bool) {
+	switch val := v.(type) {
+	case float64:
+		return val, true
+	case float32:
+		return float64(val), true
+	case int:
+		return float64(val), true
+	case int64:
+		return float64(val), true
+	case uint64:
+		return float64(val), true
+	default:
+		return 0, false
+	}
+}
In Go, when Kubernetes CRD/OpenAPI schema YAML is unmarshaled into map[string]interface{}, can numeric fields like "minimum" and "maximum" be int/int64 (not only float64)?
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/openapi/schemas.go` around lines 310 - 316, The conversion drops integer
`minimum`/`maximum` because crdSchema values are assumed to be float64; update
the handling in the block that sets schema.Min and schema.Max to accept other
numeric types (int, int64, json.Number) via a type switch or reflection, convert
them to float64, and then assign pointers to schema.Min/schema.Max (keep the
existing variable names crdSchema, schema.Min, schema.Max and the same
assignment pattern); also handle string numeric values by parsing if present so
integer constraints are preserved during CRD conversion.
pkg/openapi/paths.go (2)

192-195: ⚠️ Potential issue | 🟡 Minor

Remove the search parameter from GET by ID operation.

The buildGetOperation includes the search parameter (line 193), but single-resource GET operations retrieve by ID only—the handler doesn't use search filtering. This misleads API clients into thinking they can filter a single resource lookup.

Proposed fix
 	params := []*openapi3.ParameterRef{
-		{Ref: "#/components/parameters/search"},
 		buildPathParameter(def.Singular+"_id", def.Kind+" ID"),
 	}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/openapi/paths.go` around lines 192 - 195, Remove the erroneous search
parameter from the single-resource GET operation: inside buildGetOperation where
params is constructed (the params variable currently contains {Ref:
"#/components/parameters/search"} and buildPathParameter(def.Singular+"_id",
def.Kind+" ID")), drop the search parameter entry so the params slice only
includes the path ID parameter built by buildPathParameter; update any related
comments/tests to reflect that GET by ID accepts only the ID path parameter.

233-248: ⚠️ Potential issue | 🟠 Major

PATCH operation incorrectly references CreateRequest schema.

The PATCH operation (line 239) references {Kind}CreateRequest, but PATCH semantics require partial updates. The handler expects ResourcePatchRequest with optional Spec and Labels fields. This mismatch causes OpenAPI clients to generate incorrect request types. Generate and reference a {Kind}PatchRequest schema instead.

Proposed fix
 	return &openapi3.Operation{
 		OperationID: operationID,
 		Summary:     summary,
 		Parameters:  params,
 		RequestBody: &openapi3.RequestBodyRef{
 			Value: &openapi3.RequestBody{
-				Required: true,
+				Required: false,
 				Content: openapi3.Content{
 					"application/json": &openapi3.MediaType{
 						Schema: &openapi3.SchemaRef{
-							Ref: "#/components/schemas/" + def.Kind + "CreateRequest",
+							Ref: "#/components/schemas/" + def.Kind + "PatchRequest",
 						},
 					},
 				},
 			},
 		},

Also ensure addResourceSchemas generates a corresponding {Kind}PatchRequest schema with optional fields.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/openapi/paths.go` around lines 233 - 248, The PATCH operation currently
uses the CreateRequest schema; update the RequestBody SchemaRef in the PATCH
handler (the RequestBody block in pkg/openapi/paths.go that currently builds
"#/components/schemas/" + def.Kind + "CreateRequest") to reference
"#/components/schemas/" + def.Kind + "PatchRequest" instead, and update
addResourceSchemas to emit a {Kind}PatchRequest schema where Spec and Labels are
optional (nullable/omitted) fields matching the handler's ResourcePatchRequest
shape so OpenAPI clients generate partial-update types.
plugins/resources/plugin.go (1)

141-150: ⚠️ Potential issue | 🟠 Major

Missing status subresource route registration.

The routes register standard CRUD operations but omit the status subresource endpoints (/{id}/statuses) that are defined in the OpenAPI spec (see pkg/openapi/paths.go lines 65-68 and 101-105). Without these handlers, documented endpoints will return 404.

Proposed fix
 	// Register standard CRUD routes
 	router.HandleFunc("", handler.List).Methods(http.MethodGet)
 	router.HandleFunc("", handler.Create).Methods(http.MethodPost)
 	router.HandleFunc("/{id}", handler.Get).Methods(http.MethodGet)
 	router.HandleFunc("/{id}", handler.Patch).Methods(http.MethodPatch)
 	router.HandleFunc("/{id}", handler.Delete).Methods(http.MethodDelete)
+	router.HandleFunc("/{id}/statuses", handler.ListStatuses).Methods(http.MethodGet)
+	router.HandleFunc("/{id}/statuses", handler.CreateStatus).Methods(http.MethodPost)

Ensure ResourceHandler implements ListStatuses and CreateStatus methods.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@plugins/resources/plugin.go` around lines 141 - 150, The router registration
is missing the status subresource routes; add handlers for GET and POST on
"/{id}/statuses" by registering router.HandleFunc("/{id}/statuses",
handler.ListStatuses).Methods(http.MethodGet) and
router.HandleFunc("/{id}/statuses",
handler.CreateStatus).Methods(http.MethodPost) (place these alongside the other
CRUD registrations, before applying router.Use calls) and ensure the
ResourceHandler type implements the ListStatuses and CreateStatus methods
expected by these routes.
pkg/handlers/resource.go (1)

280-287: ⚠️ Potential issue | 🟠 Major

Bug: ownerHref uses child resource's plural instead of owner's plural.

Line 283 constructs ownerHref using h.plural (the child resource's plural, e.g., "nodepools"), but the owner href should use the owner's plural (e.g., "clusters"). This produces incorrect URLs like /api/hyperfleet/v1/nodepools/{id} instead of /api/hyperfleet/v1/clusters/{id}.

Proposed fix
 	// Set owner references for owned resources
 	if h.isOwned {
 		ownerID := mux.Vars(r)[h.ownerPathParam]
-		ownerHref := fmt.Sprintf("/api/hyperfleet/v1/%s/%s", h.plural, ownerID)
+		ownerHref := fmt.Sprintf("/api/hyperfleet/v1/%s/%s", getOwnerPlural(h.ownerKind), ownerID)
 		resource.OwnerID = &ownerID
 		resource.OwnerKind = &h.ownerKind
 		resource.OwnerHref = &ownerHref
 	}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/handlers/resource.go` around lines 280 - 287, The ownerHref is built with
the child resource plural (h.plural) causing incorrect URLs; change the
construction to use the owner's plural (e.g., h.ownerPlural) instead of h.plural
so ownerHref := fmt.Sprintf("/api/hyperfleet/v1/%s/%s", h.ownerPlural, ownerID);
keep the rest (setting resource.OwnerID, resource.OwnerKind, resource.OwnerHref)
the same and ensure you reference the existing ownerPathParam and h.ownerKind
fields.
pkg/services/resource.go (2)

103-107: ⚠️ Potential issue | 🟠 Major

Don’t surface create failure after the resource is already committed.

At Line 103-Line 107, Create returns svcErr when aggregation fails, but the resource was already persisted at Line 98. That yields false-negative creates and retry-driven duplicates.

Suggested patch
 	updatedResource, svcErr := s.UpdateResourceStatusFromAdapters(ctx, resource.Kind, resource.ID, requiredAdapters)
 	if svcErr != nil {
-		return nil, svcErr
+		ctx = logger.WithResourceID(ctx, resource.ID)
+		ctx = logger.WithResourceType(ctx, resource.Kind)
+		logger.WithError(ctx, svcErr).Warn("Create succeeded but initial status aggregation failed")
+		return resource, nil
 	}

 	return updatedResource, nil
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/services/resource.go` around lines 103 - 107, Create currently returns
svcErr from UpdateResourceStatusFromAdapters after the resource is already
persisted, causing false-negative results; change Create so that after
persisting the resource (the earlier commit) it does not return the
adapter-aggregation error to callers. Instead call
UpdateResourceStatusFromAdapters (the function named
svc.UpdateResourceStatusFromAdapters / variable updatedResource) and if it
returns an error, log or record the error internally and continue to return the
created resource and nil error to the caller; do not propagate svcErr back from
Create to avoid retries/duplicates.

118-120: ⚠️ Potential issue | 🟠 Major

Preserve generation when spec is unchanged.

At Line 118-Line 120, generation is only set for changed specs. If unchanged, caller-provided zero/stale generation can be persisted.

Suggested patch
 	if !bytes.Equal(existing.Spec, resource.Spec) {
 		resource.Generation = existing.Generation + 1
+	} else {
+		resource.Generation = existing.Generation
 	}

As per coding guidelines: "Increment the generation field on each spec update and have adapters report observed_generation in status for optimistic concurrency control".

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/services/resource.go` around lines 118 - 120, The current logic only sets
resource.Generation when specs differ, which allows a caller-supplied zero or
stale Generation to be persisted when specs are unchanged; update the code
around the bytes.Equal(existing.Spec, resource.Spec) check so that when specs
are equal you explicitly preserve the existing.Generation (set
resource.Generation = existing.Generation), and when specs differ you increment
(set resource.Generation = existing.Generation + 1); apply this change where
existing and resource are compared to ensure Generation is never overwritten
with a zero/stale value.
pkg/crd/registry.go (1)

298-304: ⚠️ Potential issue | 🟡 Minor

Enum conversion still uses raw JSON bytes (quoted strings issue).

At Line 301, string(e.Raw) preserves JSON token formatting, so string enums can become quoted literals instead of plain strings.

Suggested patch
 import (
 	"context"
+	"encoding/json"
 	"fmt"
@@
 	if len(schema.Enum) > 0 {
 		enumValues := make([]interface{}, len(schema.Enum))
 		for i, e := range schema.Enum {
-			enumValues[i] = string(e.Raw)
+			var v interface{}
+			if err := json.Unmarshal(e.Raw, &v); err == nil {
+				enumValues[i] = v
+			} else {
+				enumValues[i] = string(e.Raw)
+			}
 		}
 		result["enum"] = enumValues
 	}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/crd/registry.go` around lines 298 - 304, The enum conversion is
incorrectly using string(e.Raw) which preserves JSON quoting; change the loop
that builds enumValues (iterating schema.Enum and using e.Raw) to unmarshal each
e.Raw into a Go value (e.g., into an interface{}) and append that unmarshaled
value to enumValues so strings become plain Go strings and numbers/booleans
become their native types; ensure encoding/json is imported and handle/unwrap
any unmarshal errors in the function that builds result["enum"].
🧹 Nitpick comments (2)
pkg/validators/schema_validator.go (1)

68-78: Consider logging when validation is skipped for unknown resource types.

Returning nil for unknown resource types silently skips validation. While the comment explains this is intentional for graceful handling, consider adding debug-level logging to help diagnose issues when validation isn't applied as expected.

🔧 Optional: Add debug logging for skipped validation
 func (v *SchemaValidator) Validate(resourceType string, spec map[string]interface{}) error {
 	resourceSchema := v.schemas[resourceType]
 	if resourceSchema == nil {
 		// Unknown resource type, skip validation
+		// Consider: slog.Debug("validation skipped for unknown resource type", "resourceType", resourceType)
 		return nil
 	}
 
 	return v.validateSpec(spec, resourceSchema.Schema, resourceSchema.TypeName)
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/validators/schema_validator.go` around lines 68 - 78, The Validate method
in SchemaValidator currently returns nil for unknown resource types without any
trace; update SchemaValidator.Validate to emit a debug-level log when
resourceSchema is nil (e.g., "skipping validation: unknown resource type %s") to
aid debugging. Use the validator's existing logger field (e.g., v.logger or
v.log) to call Debug/Debugf; if SchemaValidator has no logger, add a logger
field and initialize it, then log inside the resourceSchema == nil branch before
returning nil so skipped validations are recorded.
pkg/crd/registry.go (1)

219-221: Parse hyperfleet.io/enabled with boolean parsing for robustness.

At Line 220, only the exact string "true" enables the resource (besides empty default). Using boolean parsing avoids surprising behavior for case variants and lets you reject invalid literals explicitly.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/crd/registry.go` around lines 219 - 221, Replace the fragile string check
around AnnotationEnabled by using strconv.ParseBool: read enabledStr :=
annotations[AnnotationEnabled]; if enabledStr == "" { enabled = true } else { b,
err := strconv.ParseBool(enabledStr); if err != nil { return fmt.Errorf("invalid
value for %s: %q", AnnotationEnabled, enabledStr) } enabled = b }. This uses
strconv.ParseBool for case-insensitive true/false parsing and explicitly returns
an error on invalid literals; import strconv (and fmt) as needed and adjust the
surrounding function to propagate the error.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@cmd/hyperfleet-api/server/routes.go`:
- Around line 129-152: The code currently continues when schema validator
creation fails (using schemaPath with validators.NewSchemaValidator or
generating spec with validators.NewSchemaValidatorFromSpec), leaving
schemaValidator nil and enabling a fail-open path; change the control flow so
initialization failures are treated as fatal: when err != nil for either
NewSchemaValidator or NewSchemaValidatorFromSpec, stop startup (return the error
from the enclosing function or call logger.Fatal/Exit) rather than logging a
warning and proceeding, ensuring
router.Use(middleware.SchemaValidationMiddleware(schemaValidator)) only runs
when schemaValidator was successfully created and the server does not start with
validation disabled.

In `@Makefile`:
- Around line 194-212: The test targets (test, ci-test-unit, test-integration,
ci-test-integration) set GODEBUG but do not pass the GOFIPS140 build flag into
the go test invocations; update each target to export/pass
GOFIPS140=$(GOFIPS140) into the $(GOTESTSUM) / go test command (e.g., prefix the
command with GOFIPS140=$(GOFIPS140) or add it to the environment list) so test
binaries are built with the same GOFIPS140 setting as build/install, ensuring
consistent crypto behavior for functions invoked during tests.

In `@pkg/crd/registry.go`:
- Around line 70-73: LoadFromKubernetes is acquiring the lock and then
appending/merging CRD data without clearing existing registry state, which
causes stale entries and duplicate enabled items in Registry.all on repeated
calls; update the method (and the other loading call around the same area) to
reset relevant fields (clear maps and slices such as r.all and any underlying
maps/stores on the Registry struct) immediately after acquiring the lock and
before repopulating them, ensuring you preserve the mutex locking
(r.mu.Lock()/defer r.mu.Unlock()) and reinitialize maps/slices to empty
instances rather than nil if code assumes non-nil collections.

In `@pkg/openapi/common.go`:
- Around line 207-214: ResourceConditionStatus currently defines only "True" and
"False"; update the enum for doc.Components.Schemas["ResourceConditionStatus"]
to include "Unknown" as well so it matches Kubernetes metav1.ConditionStatus and
AdapterConditionStatus; locate the SchemaRef creation for
ResourceConditionStatus and add "Unknown" to the Enum slice alongside "True" and
"False".

In `@test/factories/resources.go`:
- Around line 156-157: Replace direct calls to
dbFactory.New(context.Background()) with context-bound sessions obtained via
db.NewContext(ctx): ensure the factory functions accept or use the existing
context variable (ctx) and call db.NewContext(ctx) instead of
dbFactory.New(context.Background()); update the places where dbSession is used
(e.g., the dbSession.Model(resource).Update("status_conditions",
conditionsJSON).Error and similar mutations) to use the new context-bound
dbSession, and remove any direct creation of GORM connections so all DB
mutations use db.NewContext(ctx).

---

Outside diff comments:
In `@AGENTS.md`:
- Line 27: Update the Go version string in AGENTS.md that currently reads "Go
1.24.9" to match go.mod by changing it to "Go 1.25.0" (the line starting with
"**Language**: Go ..."); ensure the FIPS/CGO flags text (CGO_ENABLED=1,
GOEXPERIMENT=boringcrypto) remains unchanged.

In `@docs/development.md`:
- Line 9: Update the Go prerequisite text that currently reads "**Go 1.24 or
higher**" to "**Go 1.25 or higher**" to match the project's FIPS-compliant
crypto requirement; search for the exact string "**Go 1.24 or higher**" in docs
(e.g., development.md and any README/guide files) and replace it with "**Go 1.25
or higher**", and also ensure any related badge or CI/Tooling docs referencing
Go 1.24 are updated to 1.25 for consistency.

---

Duplicate comments:
In `@pkg/crd/registry.go`:
- Around line 298-304: The enum conversion is incorrectly using string(e.Raw)
which preserves JSON quoting; change the loop that builds enumValues (iterating
schema.Enum and using e.Raw) to unmarshal each e.Raw into a Go value (e.g., into
an interface{}) and append that unmarshaled value to enumValues so strings
become plain Go strings and numbers/booleans become their native types; ensure
encoding/json is imported and handle/unwrap any unmarshal errors in the function
that builds result["enum"].

In `@pkg/db/migrations/202602060001_add_resources.go`:
- Around line 37-40: The migration adds owner_kind which breaks the polymorphic
convention; replace the owner_kind column with owner_type and keep
owner_id/owner_href, i.e., alter the DDL in 202602060001_add_resources.go to
create owner_type VARCHAR(63) NULL instead of owner_kind, update any associated
indexes/constraints to reference owner_type, and search/rename all code
references (migrations, models, queries) that use owner_kind to use owner_type
so the schema follows the owner_type+owner_id polymorphic pattern.

In `@pkg/handlers/resource.go`:
- Around line 280-287: The ownerHref is built with the child resource plural
(h.plural) causing incorrect URLs; change the construction to use the owner's
plural (e.g., h.ownerPlural) instead of h.plural so ownerHref :=
fmt.Sprintf("/api/hyperfleet/v1/%s/%s", h.ownerPlural, ownerID); keep the rest
(setting resource.OwnerID, resource.OwnerKind, resource.OwnerHref) the same and
ensure you reference the existing ownerPathParam and h.ownerKind fields.

In `@pkg/openapi/generator.go`:
- Around line 69-77: The current code sorts the slice returned by registry.All()
in-place (defs) which can mutate registry-owned state; to fix it, make a shallow
copy of the slice returned by registry.All() before calling sort.Slice so you
don't change the original registry slice, then run the existing sort.Slice on
that copy (keep the same comparison using IsRoot() and Kind) and use the copied,
sorted slice for downstream processing.

In `@pkg/openapi/paths.go`:
- Around line 192-195: Remove the erroneous search parameter from the
single-resource GET operation: inside buildGetOperation where params is
constructed (the params variable currently contains {Ref:
"#/components/parameters/search"} and buildPathParameter(def.Singular+"_id",
def.Kind+" ID")), drop the search parameter entry so the params slice only
includes the path ID parameter built by buildPathParameter; update any related
comments/tests to reflect that GET by ID accepts only the ID path parameter.
- Around line 233-248: The PATCH operation currently uses the CreateRequest
schema; update the RequestBody SchemaRef in the PATCH handler (the RequestBody
block in pkg/openapi/paths.go that currently builds "#/components/schemas/" +
def.Kind + "CreateRequest") to reference "#/components/schemas/" + def.Kind +
"PatchRequest" instead, and update addResourceSchemas to emit a
{Kind}PatchRequest schema where Spec and Labels are optional (nullable/omitted)
fields matching the handler's ResourcePatchRequest shape so OpenAPI clients
generate partial-update types.

In `@pkg/openapi/schemas.go`:
- Around line 310-316: The conversion drops integer `minimum`/`maximum` because
crdSchema values are assumed to be float64; update the handling in the block
that sets schema.Min and schema.Max to accept other numeric types (int, int64,
json.Number) via a type switch or reflection, convert them to float64, and then
assign pointers to schema.Min/schema.Max (keep the existing variable names
crdSchema, schema.Min, schema.Max and the same assignment pattern); also handle
string numeric values by parsing if present so integer constraints are preserved
during CRD conversion.

In `@pkg/services/resource.go`:
- Around line 103-107: Create currently returns svcErr from
UpdateResourceStatusFromAdapters after the resource is already persisted,
causing false-negative results; change Create so that after persisting the
resource (the earlier commit) it does not return the adapter-aggregation error
to callers. Instead call UpdateResourceStatusFromAdapters (the function named
svc.UpdateResourceStatusFromAdapters / variable updatedResource) and if it
returns an error, log or record the error internally and continue to return the
created resource and nil error to the caller; do not propagate svcErr back from
Create to avoid retries/duplicates.
- Around line 118-120: The current logic only sets resource.Generation when
specs differ, which allows a caller-supplied zero or stale Generation to be
persisted when specs are unchanged; update the code around the
bytes.Equal(existing.Spec, resource.Spec) check so that when specs are equal you
explicitly preserve the existing.Generation (set resource.Generation =
existing.Generation), and when specs differ you increment (set
resource.Generation = existing.Generation + 1); apply this change where existing
and resource are compared to ensure Generation is never overwritten with a
zero/stale value.

In `@plugins/resources/plugin.go`:
- Around line 141-150: The router registration is missing the status subresource
routes; add handlers for GET and POST on "/{id}/statuses" by registering
router.HandleFunc("/{id}/statuses",
handler.ListStatuses).Methods(http.MethodGet) and
router.HandleFunc("/{id}/statuses",
handler.CreateStatus).Methods(http.MethodPost) (place these alongside the other
CRUD registrations, before applying router.Use calls) and ensure the
ResourceHandler type implements the ListStatuses and CreateStatus methods
expected by these routes.

---

Nitpick comments:
In `@pkg/crd/registry.go`:
- Around line 219-221: Replace the fragile string check around AnnotationEnabled
by using strconv.ParseBool: read enabledStr := annotations[AnnotationEnabled];
if enabledStr == "" { enabled = true } else { b, err :=
strconv.ParseBool(enabledStr); if err != nil { return fmt.Errorf("invalid value
for %s: %q", AnnotationEnabled, enabledStr) } enabled = b }. This uses
strconv.ParseBool for case-insensitive true/false parsing and explicitly returns
an error on invalid literals; import strconv (and fmt) as needed and adjust the
surrounding function to propagate the error.

In `@pkg/validators/schema_validator.go`:
- Around line 68-78: The Validate method in SchemaValidator currently returns
nil for unknown resource types without any trace; update
SchemaValidator.Validate to emit a debug-level log when resourceSchema is nil
(e.g., "skipping validation: unknown resource type %s") to aid debugging. Use
the validator's existing logger field (e.g., v.logger or v.log) to call
Debug/Debugf; if SchemaValidator has no logger, add a logger field and
initialize it, then log inside the resourceSchema == nil branch before returning
nil so skipped validations are recorded.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: e6d2b10d-2cdc-425e-aa6b-5efff901f493

📥 Commits

Reviewing files that changed from the base of the PR and between 52e3a10 and c3fa2c7.

⛔ Files ignored due to path filters (1)
  • go.sum is excluded by !**/*.sum
📒 Files selected for processing (75)
  • AGENTS.md
  • Dockerfile
  • Makefile
  • README.md
  • charts/Chart.yaml
  • charts/crds/cluster-crd.yaml
  • charts/crds/idp-crd.yaml
  • charts/crds/nodepool-crd.yaml
  • charts/templates/rbac/clusterrole-admin.yaml
  • charts/templates/rbac/clusterrole.yaml
  • charts/templates/rbac/clusterrolebinding-admin.yaml
  • charts/templates/rbac/clusterrolebinding.yaml
  • charts/values.yaml
  • cmd/hyperfleet-api/main.go
  • cmd/hyperfleet-api/server/routes.go
  • docs/development.md
  • go.mod
  • openapi/oapi-codegen.yaml
  • openapi/openapi.yaml
  • pkg/api/cluster_types.go
  • pkg/api/cluster_types_test.go
  • pkg/api/node_pool_types.go
  • pkg/api/node_pool_types_test.go
  • pkg/api/presenters/cluster.go
  • pkg/api/presenters/cluster_test.go
  • pkg/api/presenters/node_pool.go
  • pkg/api/presenters/node_pool_test.go
  • pkg/api/resource_definition.go
  • pkg/api/resource_types.go
  • pkg/crd/registry.go
  • pkg/crd/registry_test.go
  • pkg/dao/cluster.go
  • pkg/dao/mocks/cluster.go
  • pkg/dao/mocks/node_pool.go
  • pkg/dao/node_pool.go
  • pkg/dao/resource.go
  • pkg/db/migrations/202511111044_add_clusters.go
  • pkg/db/migrations/202511111055_add_node_pools.go
  • pkg/db/migrations/202601210001_add_conditions_gin_index.go
  • pkg/db/migrations/202602060001_add_resources.go
  • pkg/db/migrations/202602070001_drop_cluster_nodepool_tables.go
  • pkg/db/migrations/migration_structs.go
  • pkg/handlers/cluster.go
  • pkg/handlers/cluster_nodepools.go
  • pkg/handlers/cluster_nodepools_test.go
  • pkg/handlers/cluster_status.go
  • pkg/handlers/node_pool.go
  • pkg/handlers/nodepool_status.go
  • pkg/handlers/openapi.go
  • pkg/handlers/resource.go
  • pkg/openapi/common.go
  • pkg/openapi/generator.go
  • pkg/openapi/generator_test.go
  • pkg/openapi/paths.go
  • pkg/openapi/schemas.go
  • pkg/services/cluster.go
  • pkg/services/cluster_test.go
  • pkg/services/generic.go
  • pkg/services/generic_test.go
  • pkg/services/node_pool.go
  • pkg/services/node_pool_test.go
  • pkg/services/resource.go
  • pkg/validators/schema_validator.go
  • pkg/validators/schema_validator_test.go
  • plugins/clusters/plugin.go
  • plugins/nodePools/plugin.go
  • plugins/resources/plugin.go
  • scripts/test-api.sh
  • test/factories/clusters.go
  • test/factories/node_pools.go
  • test/factories/resources.go
  • test/integration/adapter_status_test.go
  • test/integration/clusters_test.go
  • test/integration/node_pools_test.go
  • test/integration/search_field_mapping_test.go
💤 Files with no reviewable changes (35)
  • pkg/db/migrations/202601210001_add_conditions_gin_index.go
  • pkg/handlers/nodepool_status.go
  • pkg/api/presenters/node_pool.go
  • plugins/clusters/plugin.go
  • pkg/handlers/cluster_nodepools.go
  • openapi/openapi.yaml
  • pkg/services/node_pool_test.go
  • pkg/api/cluster_types.go
  • pkg/handlers/cluster_nodepools_test.go
  • pkg/services/cluster_test.go
  • pkg/api/presenters/node_pool_test.go
  • pkg/dao/node_pool.go
  • pkg/api/presenters/cluster.go
  • test/integration/adapter_status_test.go
  • pkg/db/migrations/202511111044_add_clusters.go
  • pkg/handlers/cluster.go
  • pkg/db/migrations/202511111055_add_node_pools.go
  • plugins/nodePools/plugin.go
  • test/integration/search_field_mapping_test.go
  • pkg/services/cluster.go
  • pkg/handlers/cluster_status.go
  • openapi/oapi-codegen.yaml
  • pkg/dao/cluster.go
  • pkg/api/cluster_types_test.go
  • pkg/handlers/node_pool.go
  • pkg/api/node_pool_types.go
  • pkg/services/node_pool.go
  • pkg/dao/mocks/cluster.go
  • test/factories/clusters.go
  • test/integration/node_pools_test.go
  • pkg/api/presenters/cluster_test.go
  • test/factories/node_pools.go
  • pkg/api/node_pool_types_test.go
  • pkg/dao/mocks/node_pool.go
  • test/integration/clusters_test.go
🚧 Files skipped from review as they are similar to previous changes (12)
  • pkg/services/generic_test.go
  • scripts/test-api.sh
  • pkg/validators/schema_validator_test.go
  • charts/crds/cluster-crd.yaml
  • charts/crds/idp-crd.yaml
  • pkg/db/migrations/202602070001_drop_cluster_nodepool_tables.go
  • pkg/api/resource_definition.go
  • pkg/crd/registry_test.go
  • pkg/services/generic.go
  • charts/Chart.yaml
  • charts/crds/nodepool-crd.yaml
  • pkg/dao/resource.go

Comment on lines +129 to 152
if schemaPath != "" {
// Production: Load schema from file (Helm sets OPENAPI_SCHEMA_PATH=/etc/hyperfleet/schemas/openapi.yaml)
schemaValidator, err = validators.NewSchemaValidator(schemaPath)
if err != nil {
logger.With(ctx, logger.FieldSchemaPath, schemaPath).WithError(err).Warn("Failed to load schema validator from file")
} else {
logger.With(ctx, logger.FieldSchemaPath, schemaPath).Info("Schema validation enabled from file")
}
} else {
// Apply schema validation middleware
logger.With(ctx, logger.FieldSchemaPath, schemaPath).Info("Schema validation enabled")
// Default: Generate schema dynamically from CRD registry
spec := openapi.GenerateSpec(crd.DefaultRegistry())
schemaValidator, err = validators.NewSchemaValidatorFromSpec(spec)
if err != nil {
logger.WithError(ctx, err).Warn("Failed to create schema validator from generated spec")
} else {
logger.Info(ctx, "Schema validation enabled from dynamically generated spec")
}
}

if schemaValidator != nil {
router.Use(middleware.SchemaValidationMiddleware(schemaValidator))
} else {
logger.Warn(ctx, "Schema validation is disabled. Spec fields will not be validated.")
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Fail fast when schema validation initialization fails.

If loading/generating the schema validator fails, the code continues with validation disabled. That creates a fail-open path for invalid spec payloads.

🔧 Proposed fix
 	if schemaPath != "" {
 		// Production: Load schema from file (Helm sets OPENAPI_SCHEMA_PATH=/etc/hyperfleet/schemas/openapi.yaml)
 		schemaValidator, err = validators.NewSchemaValidator(schemaPath)
 		if err != nil {
-			logger.With(ctx, logger.FieldSchemaPath, schemaPath).WithError(err).Warn("Failed to load schema validator from file")
+			check(fmt.Errorf("failed to load schema validator from %s: %w", schemaPath, err),
+				"Unable to initialize schema validator")
 		} else {
 			logger.With(ctx, logger.FieldSchemaPath, schemaPath).Info("Schema validation enabled from file")
 		}
 	} else {
 		// Default: Generate schema dynamically from CRD registry
 		spec := openapi.GenerateSpec(crd.DefaultRegistry())
 		schemaValidator, err = validators.NewSchemaValidatorFromSpec(spec)
 		if err != nil {
-			logger.WithError(ctx, err).Warn("Failed to create schema validator from generated spec")
+			check(err, "Unable to initialize schema validator from generated spec")
 		} else {
 			logger.Info(ctx, "Schema validation enabled from dynamically generated spec")
 		}
 	}
 
-	if schemaValidator != nil {
-		router.Use(middleware.SchemaValidationMiddleware(schemaValidator))
-	} else {
-		logger.Warn(ctx, "Schema validation is disabled. Spec fields will not be validated.")
-	}
+	router.Use(middleware.SchemaValidationMiddleware(schemaValidator))
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cmd/hyperfleet-api/server/routes.go` around lines 129 - 152, The code
currently continues when schema validator creation fails (using schemaPath with
validators.NewSchemaValidator or generating spec with
validators.NewSchemaValidatorFromSpec), leaving schemaValidator nil and enabling
a fail-open path; change the control flow so initialization failures are treated
as fatal: when err != nil for either NewSchemaValidator or
NewSchemaValidatorFromSpec, stop startup (return the error from the enclosing
function or call logger.Fatal/Exit) rather than logging a warning and
proceeding, ensuring
router.Use(middleware.SchemaValidationMiddleware(schemaValidator)) only runs
when schemaValidator was successfully created and the server does not start with
validation disabled.

Comment on lines +194 to 212
GODEBUG=fips140=on OCM_ENV=unit_testing $(GOTESTSUM) --format $(TEST_SUMMARY_FORMAT) -- -p 1 -v $(TESTFLAGS) \
./pkg/... \
./cmd/...

.PHONY: ci-test-unit
ci-test-unit: install secrets $(GOTESTSUM) ## Run unit tests with JSON output
OCM_ENV=unit_testing $(GOTESTSUM) --jsonfile-timing-events=$(unit_test_json_output) --format $(TEST_SUMMARY_FORMAT) -- -p 1 -v $(TESTFLAGS) \
GODEBUG=fips140=on OCM_ENV=unit_testing $(GOTESTSUM) --jsonfile-timing-events=$(unit_test_json_output) --format $(TEST_SUMMARY_FORMAT) -- -p 1 -v $(TESTFLAGS) \
./pkg/... \
./cmd/...

.PHONY: test-integration
test-integration: install secrets $(GOTESTSUM) ## Run integration tests
TESTCONTAINERS_RYUK_DISABLED=true OCM_ENV=integration_testing $(GOTESTSUM) --format $(TEST_SUMMARY_FORMAT) -- -p 1 -ldflags -s -v -timeout 1h $(TESTFLAGS) \
GODEBUG=fips140=on TESTCONTAINERS_RYUK_DISABLED=true OCM_ENV=integration_testing $(GOTESTSUM) --format $(TEST_SUMMARY_FORMAT) -- -p 1 -ldflags -s -v -timeout 1h $(TESTFLAGS) \
./test/integration

.PHONY: ci-test-integration
ci-test-integration: install secrets $(GOTESTSUM) ## Run integration tests with JSON output
TESTCONTAINERS_RYUK_DISABLED=true OCM_ENV=integration_testing $(GOTESTSUM) --jsonfile-timing-events=$(integration_test_json_output) --format $(TEST_SUMMARY_FORMAT) -- -p 1 -ldflags -s -v -timeout 1h $(TESTFLAGS) \
GODEBUG=fips140=on TESTCONTAINERS_RYUK_DISABLED=true OCM_ENV=integration_testing $(GOTESTSUM) --jsonfile-timing-events=$(integration_test_json_output) --format $(TEST_SUMMARY_FORMAT) -- -p 1 -ldflags -s -v -timeout 1h $(TESTFLAGS) \
./test/integration
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Verify where GOFIPS140 is (and isn't) propagated in Makefile test/build paths.
rg -n -A2 -B2 '^(build|install|cmds|test|ci-test-unit|test-integration|ci-test-integration):|GOFIPS140=|GODEBUG=fips140=on' Makefile

Repository: openshift-hyperfleet/hyperfleet-api

Length of output: 2875


Pass GOFIPS140 into go test invocations for consistency with build/install.

The test, ci-test-unit, test-integration, and ci-test-integration targets set GODEBUG=fips140=on but do not pass GOFIPS140=$(GOFIPS140) when compiling test binaries. This creates a mismatch: production binaries built via build/install/cmds use GOFIPS140=$(GOFIPS140), but test binaries are compiled without this flag, potentially using different crypto implementations during testing.

Suggested patch
-GODEBUG=fips140=on OCM_ENV=unit_testing $(GOTESTSUM) --format $(TEST_SUMMARY_FORMAT) -- -p 1 -v $(TESTFLAGS) \
+GOFIPS140=$(GOFIPS140) GODEBUG=fips140=on OCM_ENV=unit_testing $(GOTESTSUM) --format $(TEST_SUMMARY_FORMAT) -- -p 1 -v $(TESTFLAGS) \
 		./pkg/... \
 		./cmd/...

-GODEBUG=fips140=on OCM_ENV=unit_testing $(GOTESTSUM) --jsonfile-timing-events=$(unit_test_json_output) --format $(TEST_SUMMARY_FORMAT) -- -p 1 -v $(TESTFLAGS) \
+GOFIPS140=$(GOFIPS140) GODEBUG=fips140=on OCM_ENV=unit_testing $(GOTESTSUM) --jsonfile-timing-events=$(unit_test_json_output) --format $(TEST_SUMMARY_FORMAT) -- -p 1 -v $(TESTFLAGS) \
 		./pkg/... \
 		./cmd/...

-GODEBUG=fips140=on TESTCONTAINERS_RYUK_DISABLED=true OCM_ENV=integration_testing $(GOTESTSUM) --format $(TEST_SUMMARY_FORMAT) -- -p 1 -ldflags -s -v -timeout 1h $(TESTFLAGS) \
+GOFIPS140=$(GOFIPS140) GODEBUG=fips140=on TESTCONTAINERS_RYUK_DISABLED=true OCM_ENV=integration_testing $(GOTESTSUM) --format $(TEST_SUMMARY_FORMAT) -- -p 1 -ldflags -s -v -timeout 1h $(TESTFLAGS) \
 			./test/integration

-GODEBUG=fips140=on TESTCONTAINERS_RYUK_DISABLED=true OCM_ENV=integration_testing $(GOTESTSUM) --jsonfile-timing-events=$(integration_test_json_output) --format $(TEST_SUMMARY_FORMAT) -- -p 1 -ldflags -s -v -timeout 1h $(TESTFLAGS) \
+GOFIPS140=$(GOFIPS140) GODEBUG=fips140=on TESTCONTAINERS_RYUK_DISABLED=true OCM_ENV=integration_testing $(GOTESTSUM) --jsonfile-timing-events=$(integration_test_json_output) --format $(TEST_SUMMARY_FORMAT) -- -p 1 -ldflags -s -v -timeout 1h $(TESTFLAGS) \
 			./test/integration
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@Makefile` around lines 194 - 212, The test targets (test, ci-test-unit,
test-integration, ci-test-integration) set GODEBUG but do not pass the GOFIPS140
build flag into the go test invocations; update each target to export/pass
GOFIPS140=$(GOFIPS140) into the $(GOTESTSUM) / go test command (e.g., prefix the
command with GOFIPS140=$(GOFIPS140) or add it to the environment list) so test
binaries are built with the same GOFIPS140 setting as build/install, ensuring
consistent crypto behavior for functions invoked during tests.

Comment on lines +70 to +73
func (r *Registry) LoadFromKubernetes(ctx context.Context) error {
r.mu.Lock()
defer r.mu.Unlock()

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Reset registry state before loading new CRDs.

At Line 70 and Line 117, load starts without clearing existing maps/slice. Repeated calls can keep stale CRDs and duplicate enabled entries in Registry.all.

Suggested patch
 func (r *Registry) LoadFromKubernetes(ctx context.Context) error {
 	r.mu.Lock()
 	defer r.mu.Unlock()
+	r.byKind = make(map[string]*api.ResourceDefinition)
+	r.byPlural = make(map[string]*api.ResourceDefinition)
+	r.all = make([]*api.ResourceDefinition, 0)

 	// Create Kubernetes client
 	config, err := getKubeConfig()
@@
 func (r *Registry) LoadFromDirectory(dir string) error {
 	r.mu.Lock()
 	defer r.mu.Unlock()
+	r.byKind = make(map[string]*api.ResourceDefinition)
+	r.byPlural = make(map[string]*api.ResourceDefinition)
+	r.all = make([]*api.ResourceDefinition, 0)

 	// Find all YAML files
 	files, err := os.ReadDir(dir)

Also applies to: 117-118

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/crd/registry.go` around lines 70 - 73, LoadFromKubernetes is acquiring
the lock and then appending/merging CRD data without clearing existing registry
state, which causes stale entries and duplicate enabled items in Registry.all on
repeated calls; update the method (and the other loading call around the same
area) to reset relevant fields (clear maps and slices such as r.all and any
underlying maps/stores on the Registry struct) immediately after acquiring the
lock and before repopulating them, ensuring you preserve the mutex locking
(r.mu.Lock()/defer r.mu.Unlock()) and reinitialize maps/slices to empty
instances rather than nil if code assumes non-nil collections.

Comment on lines +207 to +214
// ResourceConditionStatus enum
doc.Components.Schemas["ResourceConditionStatus"] = &openapi3.SchemaRef{
Value: &openapi3.Schema{
Type: &openapi3.Types{"string"},
Enum: []interface{}{"True", "False"},
Description: "Status value for resource conditions",
},
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Add "Unknown" to ResourceConditionStatus enum for Kubernetes compatibility.

ResourceConditionStatus (line 211) only includes "True" and "False", while AdapterConditionStatus (line 354) correctly includes "Unknown". Kubernetes metav1.ConditionStatus defines three values: True, False, and Unknown. For consistency and Kubernetes compatibility, add "Unknown" to ResourceConditionStatus.

Proposed fix
 	doc.Components.Schemas["ResourceConditionStatus"] = &openapi3.SchemaRef{
 		Value: &openapi3.Schema{
 			Type:        &openapi3.Types{"string"},
-			Enum:        []interface{}{"True", "False"},
+			Enum:        []interface{}{"True", "False", "Unknown"},
 			Description: "Status value for resource conditions",
 		},
 	}
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
// ResourceConditionStatus enum
doc.Components.Schemas["ResourceConditionStatus"] = &openapi3.SchemaRef{
Value: &openapi3.Schema{
Type: &openapi3.Types{"string"},
Enum: []interface{}{"True", "False"},
Description: "Status value for resource conditions",
},
}
// ResourceConditionStatus enum
doc.Components.Schemas["ResourceConditionStatus"] = &openapi3.SchemaRef{
Value: &openapi3.Schema{
Type: &openapi3.Types{"string"},
Enum: []interface{}{"True", "False", "Unknown"},
Description: "Status value for resource conditions",
},
}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/openapi/common.go` around lines 207 - 214, ResourceConditionStatus
currently defines only "True" and "False"; update the enum for
doc.Components.Schemas["ResourceConditionStatus"] to include "Unknown" as well
so it matches Kubernetes metav1.ConditionStatus and AdapterConditionStatus;
locate the SchemaRef creation for ResourceConditionStatus and add "Unknown" to
the Enum slice alongside "True" and "False".

Comment on lines +156 to +157
dbSession := dbFactory.New(context.Background())
err = dbSession.Model(resource).Update("status_conditions", conditionsJSON).Error
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Use context-bound DB sessions in factory mutations.

At Line 156, Line 184, and Line 212, sessions are created directly with dbFactory.New(context.Background()) instead of using the context-based DB session pattern used by this repo.

As per coding guidelines: "Always retrieve database sessions from context via db.NewContext(ctx) rather than creating new GORM connections directly".

Also applies to: 184-185, 212-213

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/factories/resources.go` around lines 156 - 157, Replace direct calls to
dbFactory.New(context.Background()) with context-bound sessions obtained via
db.NewContext(ctx): ensure the factory functions accept or use the existing
context variable (ctx) and call db.NewContext(ctx) instead of
dbFactory.New(context.Background()); update the places where dbSession is used
(e.g., the dbSession.Model(resource).Update("status_conditions",
conditionsJSON).Error and similar mutations) to use the new context-bound
dbSession, and remove any direct creation of GORM connections so all DB
mutations use db.NewContext(ctx).

- Wire GET/POST /{id}/statuses endpoints for adapter status management
- Register AdapterStatusService in service registry for plugin access
- Fix Create to not propagate status aggregation errors after successful persist
- Fix Replace to preserve generation from existing record when spec unchanged

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

♻️ Duplicate comments (1)
pkg/handlers/resource.go (1)

365-368: ⚠️ Potential issue | 🟠 Major

Build owner_href with owner plural, not child plural.

ownerHref currently uses h.plural (child resource collection), producing incorrect owner links.

Suggested patch
-		ownerHref := fmt.Sprintf("/api/hyperfleet/v1/%s/%s", h.plural, ownerID) // Simplified, adjust as needed
+		ownerHref := fmt.Sprintf("/api/hyperfleet/v1/%s/%s", getOwnerPlural(h.ownerKind), ownerID)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/handlers/resource.go` around lines 365 - 368, The ownerHref is being
built with the child's plural (h.plural) which yields incorrect owner links;
change the URL to use the owner's collection plural (e.g., h.ownerPlural)
instead of h.plural — update the ownerHref construction in the block that checks
h.isOwned (where ownerID := mux.Vars(r)[h.ownerPathParam] and resource.OwnerID
is set) to fmt.Sprintf("/api/hyperfleet/v1/%s/%s", h.ownerPlural, ownerID); if
h.ownerPlural does not exist add a field or accessor that returns the owner's
plural name and use that.
🧹 Nitpick comments (1)
pkg/services/resource.go (1)

204-206: Log malformed adapter condition payloads before skipping.

Skipping silently makes production debugging harder when adapters send invalid condition JSON.

Suggested patch
 		if err := json.Unmarshal(adapterStatus.Conditions, &conditions); err != nil {
-			continue // Skip if can't unmarshal
+			ctx = logger.WithResourceID(ctx, resourceID)
+			ctx = logger.WithResourceType(ctx, kind)
+			logger.WithError(ctx, err).Warn("Skipping adapter status with invalid conditions payload")
+			continue
 		}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/services/resource.go` around lines 204 - 206, When
json.Unmarshal(adapterStatus.Conditions, &conditions) fails, don't silently
continue: log the unmarshalling error and the malformed payload plus identifying
adapter fields before skipping. Update the error handling in the block around
json.Unmarshal to call the existing logger (e.g., r.logger or the package
logger) with the error, adapterStatus.Adapter and other identifying fields (like
adapterStatus.Name/Namespace or the raw adapterStatus.Conditions string) so
operators can see which adapter sent invalid JSON, then continue.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@pkg/handlers/resource.go`:
- Around line 306-313: The code sets AdapterStatus.LastReportTime from
client-provided req.ObservedTime which is API-managed and can be spoofed; update
the AdapterStatus construction (where now := req.ObservedTime and the
AdapterStatus literal is built) to instead set LastReportTime from server time
(e.g., time.Now()/time.Now().UTC()) and remove reliance on req.ObservedTime so
the server controls the timestamp; keep AdapterStatus, LastReportTime,
req.ObservedTime and id references to locate and modify the block in
pkg/handlers/resource.go.
- Around line 258-266: ListStatuses currently uses only mux.Vars(r)["id"] and
calls h.adapterStatus.FindByResourcePaginated, allowing access across owner
paths; update ListStatuses to first read owner := mux.Vars(r)["owner"] (if
present) and, for owned resource kinds, fetch the resource (e.g., via
h.adapterResource.FindByID or a GetOwner helper) and compare the resource's
owner to the owner path param, returning a 404/403 ServiceError if they differ
before calling h.adapterStatus.FindByResourcePaginated; make the identical
owner-check change to the status creation handler referenced around the other
block (apply same owner validation for CreateStatus or corresponding method) so
both ListStatuses and the status-create path enforce owner scoping.

---

Duplicate comments:
In `@pkg/handlers/resource.go`:
- Around line 365-368: The ownerHref is being built with the child's plural
(h.plural) which yields incorrect owner links; change the URL to use the owner's
collection plural (e.g., h.ownerPlural) instead of h.plural — update the
ownerHref construction in the block that checks h.isOwned (where ownerID :=
mux.Vars(r)[h.ownerPathParam] and resource.OwnerID is set) to
fmt.Sprintf("/api/hyperfleet/v1/%s/%s", h.ownerPlural, ownerID); if
h.ownerPlural does not exist add a field or accessor that returns the owner's
plural name and use that.

---

Nitpick comments:
In `@pkg/services/resource.go`:
- Around line 204-206: When json.Unmarshal(adapterStatus.Conditions,
&conditions) fails, don't silently continue: log the unmarshalling error and the
malformed payload plus identifying adapter fields before skipping. Update the
error handling in the block around json.Unmarshal to call the existing logger
(e.g., r.logger or the package logger) with the error, adapterStatus.Adapter and
other identifying fields (like adapterStatus.Name/Namespace or the raw
adapterStatus.Conditions string) so operators can see which adapter sent invalid
JSON, then continue.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 8e8d5be2-0d8b-408a-bd47-6a8e062caf15

📥 Commits

Reviewing files that changed from the base of the PR and between c3fa2c7 and 5fea203.

📒 Files selected for processing (3)
  • pkg/handlers/resource.go
  • pkg/services/resource.go
  • plugins/resources/plugin.go
🚧 Files skipped from review as they are similar to previous changes (1)
  • plugins/resources/plugin.go

Comment on lines +258 to +266
func (h *ResourceHandler) ListStatuses(w http.ResponseWriter, r *http.Request) {
cfg := &handlerConfig{
Action: func() (interface{}, *errors.ServiceError) {
ctx := r.Context()
id := mux.Vars(r)["id"]
listArgs := services.NewListArguments(r.URL.Query())

statuses, total, svcErr := h.adapterStatus.FindByResourcePaginated(ctx, h.kind, id, listArgs)
if svcErr != nil {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Enforce owner scoping on status endpoints for owned resources.

For owned resources, status list/create currently use only {id} and ignore {owner} path validation. This allows cross-owner status access/update through mismatched owner paths.

Suggested patch
 func (h *ResourceHandler) ListStatuses(w http.ResponseWriter, r *http.Request) {
 	cfg := &handlerConfig{
 		Action: func() (interface{}, *errors.ServiceError) {
 			ctx := r.Context()
 			id := mux.Vars(r)["id"]
 			listArgs := services.NewListArguments(r.URL.Query())
+
+			if h.isOwned {
+				ownerID := mux.Vars(r)[h.ownerPathParam]
+				if _, svcErr := h.resource.GetByOwner(ctx, h.kind, ownerID, id); svcErr != nil {
+					return nil, svcErr
+				}
+			}
 
 			statuses, total, svcErr := h.adapterStatus.FindByResourcePaginated(ctx, h.kind, id, listArgs)
 			if svcErr != nil {
 				return nil, svcErr
 			}
@@
 	cfg := &handlerConfig{
 		&req,
 		[]validate{},
 		func() (interface{}, *errors.ServiceError) {
 			ctx := r.Context()
 			id := mux.Vars(r)["id"]
+
+			if h.isOwned {
+				ownerID := mux.Vars(r)[h.ownerPathParam]
+				if _, svcErr := h.resource.GetByOwner(ctx, h.kind, ownerID, id); svcErr != nil {
+					return nil, svcErr
+				}
+			}
 
 			conditionsJSON, err := json.Marshal(req.Conditions)
 			if err != nil {
 				return nil, errors.GeneralError("Failed to marshal conditions: %v", err)
 			}

Also applies to: 318-326

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/handlers/resource.go` around lines 258 - 266, ListStatuses currently uses
only mux.Vars(r)["id"] and calls h.adapterStatus.FindByResourcePaginated,
allowing access across owner paths; update ListStatuses to first read owner :=
mux.Vars(r)["owner"] (if present) and, for owned resource kinds, fetch the
resource (e.g., via h.adapterResource.FindByID or a GetOwner helper) and compare
the resource's owner to the owner path param, returning a 404/403 ServiceError
if they differ before calling h.adapterStatus.FindByResourcePaginated; make the
identical owner-check change to the status creation handler referenced around
the other block (apply same owner validation for CreateStatus or corresponding
method) so both ListStatuses and the status-create path enforce owner scoping.

Comment on lines +306 to +313
now := req.ObservedTime
adapterStatus := &api.AdapterStatus{
ResourceType: h.kind,
ResourceID: id,
Adapter: req.Adapter,
ObservedGeneration: req.ObservedGeneration,
LastReportTime: &now,
Conditions: conditionsJSON,
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Do not take last_report_time from client input.

LastReportTime is API-managed, but this path sets it from req.ObservedTime. Use server time for consistency and to prevent spoofed chronology.

Suggested patch
-			now := req.ObservedTime
+			now := time.Now().UTC()
 			adapterStatus := &api.AdapterStatus{
 				ResourceType:       h.kind,
 				ResourceID:         id,
 				Adapter:            req.Adapter,
 				ObservedGeneration: req.ObservedGeneration,
 				LastReportTime:     &now,
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
now := req.ObservedTime
adapterStatus := &api.AdapterStatus{
ResourceType: h.kind,
ResourceID: id,
Adapter: req.Adapter,
ObservedGeneration: req.ObservedGeneration,
LastReportTime: &now,
Conditions: conditionsJSON,
now := time.Now().UTC()
adapterStatus := &api.AdapterStatus{
ResourceType: h.kind,
ResourceID: id,
Adapter: req.Adapter,
ObservedGeneration: req.ObservedGeneration,
LastReportTime: &now,
Conditions: conditionsJSON,
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/handlers/resource.go` around lines 306 - 313, The code sets
AdapterStatus.LastReportTime from client-provided req.ObservedTime which is
API-managed and can be spoofed; update the AdapterStatus construction (where now
:= req.ObservedTime and the AdapterStatus literal is built) to instead set
LastReportTime from server time (e.g., time.Now()/time.Now().UTC()) and remove
reliance on req.ObservedTime so the server controls the timestamp; keep
AdapterStatus, LastReportTime, req.ObservedTime and id references to locate and
modify the block in pkg/handlers/resource.go.

- Add CRD governance validator (pkg/crd/validator.go) enforcing 11 rules:
  group, labels, scope, owner, adapters, version, schema, naming conventions
- Strip status conditions boilerplate from CRD YAMLs (auto-injected at runtime)
- Define explicit spec properties in all 3 CRDs, removing opaque
  x-kubernetes-preserve-unknown-fields from spec level
- Replace hardcoded validation middleware with registry-aware dynamic matching
- Fix deployment secret mount path (/build/secrets -> /app/secrets)
- Fix config file resolution to use os.Getwd() instead of runtime.Caller(0)
  which breaks with -trimpath build flag

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@openshift-ci
Copy link

openshift-ci bot commented Mar 4, 2026

@ciaranRoche: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
ci/prow/images-images 1a2f704 link true /test images-images
ci/prow/unit 1a2f704 link true /test unit
ci/prow/presubmits-integration 1a2f704 link true /test presubmits-integration
ci/prow/lint 1a2f704 link true /test lint

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

♻️ Duplicate comments (3)
pkg/crd/registry.go (2)

306-311: ⚠️ Potential issue | 🔴 Critical

Enum conversion preserves JSON quotes and can break schema validation.

string(e.Raw) keeps JSON literal formatting for string enums (e.g. "OIDC" including quotes), which can produce incorrect enum values in generated schemas.

🔧 Proposed fix
 import (
 	"context"
+	"encoding/json"
 	"fmt"
 	"os"
@@
 	if len(schema.Enum) > 0 {
 		enumValues := make([]interface{}, len(schema.Enum))
 		for i, e := range schema.Enum {
-			enumValues[i] = string(e.Raw)
+			var v interface{}
+			if err := json.Unmarshal(e.Raw, &v); err == nil {
+				enumValues[i] = v
+			} else {
+				enumValues[i] = string(e.Raw)
+			}
 		}
 		result["enum"] = enumValues
 	}

Run this to verify the behavior and the current code path:

#!/bin/bash
# 1) Confirm current enum conversion implementation.
rg -n 'enumValues\[i\]\s*=\s*string\(e\.Raw\)' pkg/crd/registry.go

# 2) Demonstrate why direct conversion keeps quotes for string enums.
python - <<'PY'
import json
raw = json.dumps("OIDC").encode("utf-8")
print("raw bytes:", raw)
print("string form:", raw.decode("utf-8"))  # expected: "OIDC" (with quotes)
PY
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/crd/registry.go` around lines 306 - 311, The enum conversion is using
string(e.Raw) which preserves JSON quotes (e.g. "\"OIDC\""); instead, parse
e.Raw as JSON into a Go value and store that value in enumValues so string enums
become plain Go strings. Update the loop over schema.Enum (the variables
enumValues, schema.Enum, e.Raw) to json.Unmarshal(e.Raw, &v) for each entry and
set enumValues[i] = v; on unmarshal error fall back to stripping surrounding
quotes from string(e.Raw) or otherwise convert safely before assigning to
result["enum"].

71-73: ⚠️ Potential issue | 🟠 Major

Reset registry state before each load operation.

Both load paths repopulate in-place without clearing prior state. Repeated loads can produce stale map entries and duplicate r.all items.

🔧 Proposed fix
 type Registry struct {
 	mu       sync.RWMutex
 	byKind   map[string]*api.ResourceDefinition
 	byPlural map[string]*api.ResourceDefinition
 	all      []*api.ResourceDefinition
 }
+
+func (r *Registry) resetLocked() {
+	r.byKind = make(map[string]*api.ResourceDefinition)
+	r.byPlural = make(map[string]*api.ResourceDefinition)
+	r.all = make([]*api.ResourceDefinition, 0)
+}
@@
 func (r *Registry) LoadFromKubernetes(ctx context.Context) error {
 	r.mu.Lock()
 	defer r.mu.Unlock()
+	r.resetLocked()
@@
 func (r *Registry) LoadFromDirectory(dir string) error {
 	r.mu.Lock()
 	defer r.mu.Unlock()
+	r.resetLocked()

Also applies to: 124-126

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/crd/registry.go` around lines 71 - 73, Reset the Registry internal state
before repopulating to avoid stale/duplicate entries: inside
Registry.LoadFromKubernetes (while holding r.mu) clear r.all (set to nil or
empty slice) and reinitialize any internal maps used for indexing (e.g.,
r.byName, r.byID or similar) before you append/populate; apply the same reset at
the start of the other load path that repopulates r.all (the second loader that
touches r.all around the 124-126 region).
cmd/hyperfleet-api/server/routes.go (1)

129-152: ⚠️ Potential issue | 🟠 Major

Avoid fail-open startup when schema validator initialization fails.

Continuing with validation disabled can let invalid spec payloads pass silently. This should fail startup unless explicitly configured otherwise.

🔧 Proposed fix
 	if schemaPath != "" {
 		// Production: Load schema from file (Helm sets OPENAPI_SCHEMA_PATH=/etc/hyperfleet/schemas/openapi.yaml)
 		schemaValidator, err = validators.NewSchemaValidator(schemaPath)
 		if err != nil {
-			logger.With(ctx, logger.FieldSchemaPath, schemaPath).WithError(err).Warn("Failed to load schema validator from file")
+			check(fmt.Errorf("failed to load schema validator from %s: %w", schemaPath, err),
+				"Unable to initialize schema validator from file")
 		} else {
 			logger.With(ctx, logger.FieldSchemaPath, schemaPath).Info("Schema validation enabled from file")
 		}
 	} else {
 		// Default: Generate schema dynamically from CRD registry
 		spec := openapi.GenerateSpec(crd.DefaultRegistry())
 		schemaValidator, err = validators.NewSchemaValidatorFromSpec(spec)
 		if err != nil {
-			logger.WithError(ctx, err).Warn("Failed to create schema validator from generated spec")
+			check(err, "Unable to initialize schema validator from generated spec")
 		} else {
 			logger.Info(ctx, "Schema validation enabled from dynamically generated spec")
 		}
 	}
-
-	if schemaValidator != nil {
-		router.Use(middleware.SchemaValidationMiddleware(schemaValidator, crd.DefaultRegistry()))
-	} else {
-		logger.Warn(ctx, "Schema validation is disabled. Spec fields will not be validated.")
-	}
+	router.Use(middleware.SchemaValidationMiddleware(schemaValidator, crd.DefaultRegistry()))
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cmd/hyperfleet-api/server/routes.go` around lines 129 - 152, The startup
currently continues with schema validation disabled when
validators.NewSchemaValidator or validators.NewSchemaValidatorFromSpec fails;
change this so initialization failure causes startup to abort unless an explicit
opt-out flag is set. Specifically, after attempting to create schemaValidator
(from schemaPath via NewSchemaValidator or from spec via
NewSchemaValidatorFromSpec), if schemaValidator == nil and an explicit
config/env flag (e.g., ALLOW_SCHEMA_VALIDATION_FAIL_OPEN or a boolean in your
server config) is not true, log the error with context and stop startup (return
an error from the routes setup or call a fatal/exit path) instead of calling
router.Use or falling back to disabled validation; only allow continuing with
router.Use(middleware.SchemaValidationMiddleware(...)) when schemaValidator is
non-nil or the opt-out flag is explicitly enabled.
🧹 Nitpick comments (1)
pkg/middleware/schema_validation_test.go (1)

483-500: Don’t ignore registry.Register errors in test setup.

Silently discarding registration failures can mask broken fixtures and make tests pass for the wrong reason.

🔧 Proposed fix
-	_ = registry.Register(&api.ResourceDefinition{
+	if err := registry.Register(&api.ResourceDefinition{
 		Kind:    "Cluster",
 		Plural:  "clusters",
 		Scope:   api.ResourceScopeRoot,
 		Enabled: true,
-	})
-	_ = registry.Register(&api.ResourceDefinition{
+	}); err != nil {
+		t.Fatalf("register Cluster: %v", err)
+	}
+	if err := registry.Register(&api.ResourceDefinition{
 		Kind:    "NodePool",
 		Plural:  "nodepools",
 		Scope:   api.ResourceScopeOwned,
 		Enabled: true,
-	})
-	_ = registry.Register(&api.ResourceDefinition{
+	}); err != nil {
+		t.Fatalf("register NodePool: %v", err)
+	}
+	if err := registry.Register(&api.ResourceDefinition{
 		Kind:    "IDP",
 		Plural:  "idps",
 		Scope:   api.ResourceScopeOwned,
 		Enabled: true,
-	})
+	}); err != nil {
+		t.Fatalf("register IDP: %v", err)
+	}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/middleware/schema_validation_test.go` around lines 483 - 500, The test
setup is swallowing errors from registry.Register when registering
ResourceDefinition for "Cluster", "NodePool", and "IDP"; change those calls to
check the returned error and fail the test on error (e.g., if err :=
registry.Register(...); err != nil { t.Fatalf("registry.Register(%s): %v",
"<Kind>", err) } or use require.NoError/Assert in the test) so registration
failures surface immediately; update each registry.Register invocation for
Cluster, NodePool and IDP to validate the error and fail the test when non-nil.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@pkg/middleware/schema_validation.go`:
- Around line 137-145: Add a nil guard for the registry parameter before
iterating segments: ensure you check if registry == nil (or otherwise invalid)
and return a safe default (e.g., false and empty kind) instead of calling
registry.GetByPlural(seg); update the logic around the for loop that inspects
segments and the call to registry.GetByPlural(seg) so the function
short-circuits when registry is nil to avoid a panic.

---

Duplicate comments:
In `@cmd/hyperfleet-api/server/routes.go`:
- Around line 129-152: The startup currently continues with schema validation
disabled when validators.NewSchemaValidator or
validators.NewSchemaValidatorFromSpec fails; change this so initialization
failure causes startup to abort unless an explicit opt-out flag is set.
Specifically, after attempting to create schemaValidator (from schemaPath via
NewSchemaValidator or from spec via NewSchemaValidatorFromSpec), if
schemaValidator == nil and an explicit config/env flag (e.g.,
ALLOW_SCHEMA_VALIDATION_FAIL_OPEN or a boolean in your server config) is not
true, log the error with context and stop startup (return an error from the
routes setup or call a fatal/exit path) instead of calling router.Use or falling
back to disabled validation; only allow continuing with
router.Use(middleware.SchemaValidationMiddleware(...)) when schemaValidator is
non-nil or the opt-out flag is explicitly enabled.

In `@pkg/crd/registry.go`:
- Around line 306-311: The enum conversion is using string(e.Raw) which
preserves JSON quotes (e.g. "\"OIDC\""); instead, parse e.Raw as JSON into a Go
value and store that value in enumValues so string enums become plain Go
strings. Update the loop over schema.Enum (the variables enumValues,
schema.Enum, e.Raw) to json.Unmarshal(e.Raw, &v) for each entry and set
enumValues[i] = v; on unmarshal error fall back to stripping surrounding quotes
from string(e.Raw) or otherwise convert safely before assigning to
result["enum"].
- Around line 71-73: Reset the Registry internal state before repopulating to
avoid stale/duplicate entries: inside Registry.LoadFromKubernetes (while holding
r.mu) clear r.all (set to nil or empty slice) and reinitialize any internal maps
used for indexing (e.g., r.byName, r.byID or similar) before you
append/populate; apply the same reset at the start of the other load path that
repopulates r.all (the second loader that touches r.all around the 124-126
region).

---

Nitpick comments:
In `@pkg/middleware/schema_validation_test.go`:
- Around line 483-500: The test setup is swallowing errors from
registry.Register when registering ResourceDefinition for "Cluster", "NodePool",
and "IDP"; change those calls to check the returned error and fail the test on
error (e.g., if err := registry.Register(...); err != nil {
t.Fatalf("registry.Register(%s): %v", "<Kind>", err) } or use
require.NoError/Assert in the test) so registration failures surface
immediately; update each registry.Register invocation for Cluster, NodePool and
IDP to validate the error and fail the test when non-nil.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: df2941d7-461d-4e5f-81f8-6886738bb0ea

📥 Commits

Reviewing files that changed from the base of the PR and between 5fea203 and 1a2f704.

📒 Files selected for processing (11)
  • charts/crds/cluster-crd.yaml
  • charts/crds/idp-crd.yaml
  • charts/crds/nodepool-crd.yaml
  • charts/templates/deployment.yaml
  • cmd/hyperfleet-api/server/routes.go
  • pkg/config/config.go
  • pkg/crd/registry.go
  • pkg/crd/validator.go
  • pkg/crd/validator_test.go
  • pkg/middleware/schema_validation.go
  • pkg/middleware/schema_validation_test.go

Comment on lines +137 to 145
for i := len(segments) - 1; i >= 0; i-- {
seg := segments[i]
if seg == "" {
continue
}
if def, found := registry.GetByPlural(seg); found {
return true, strings.ToLower(def.Kind)
}
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Consider adding a nil guard for the registry parameter.

If registry is nil, calling registry.GetByPlural(seg) on line 142 will cause a panic. While the middleware may only be attached when a valid registry exists, adding a defensive nil check improves robustness and makes the function safer to call in isolation.

🛡️ Proposed fix to add nil guard
 func shouldValidateRequest(method, path string, registry *crd.Registry) (bool, string) {
 	// Only validate POST and PATCH requests
 	if method != http.MethodPost && method != http.MethodPatch {
 		return false, ""
 	}

+	// Skip if no registry available
+	if registry == nil {
+		return false, ""
+	}
+
 	// Skip status subresource paths (handled separately)
 	if strings.HasSuffix(path, "/statuses") {
 		return false, ""
 	}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/middleware/schema_validation.go` around lines 137 - 145, Add a nil guard
for the registry parameter before iterating segments: ensure you check if
registry == nil (or otherwise invalid) and return a safe default (e.g., false
and empty kind) instead of calling registry.GetByPlural(seg); update the logic
around the for loop that inspects segments and the call to
registry.GetByPlural(seg) so the function short-circuits when registry is nil to
avoid a panic.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants