Skip to content

Separate CRDs Helm chart for upgradeable CRD lifecycle#208

Open
carlydf wants to merge 3 commits intomainfrom
cdf/crd-chart
Open

Separate CRDs Helm chart for upgradeable CRD lifecycle#208
carlydf wants to merge 3 commits intomainfrom
cdf/crd-chart

Conversation

@carlydf
Copy link
Collaborator

@carlydf carlydf commented Feb 26, 2026

Summary

  • Creates a new temporal-worker-controller-crds Helm chart with CRDs in templates/ (not crds/) so they are upgraded on helm upgrade
  • Adds helm.sh/resource-policy: keep to all CRDs so they survive helm uninstall of either chart
  • Removes crds/ from the main chart; CRDs now live exclusively in the CRDs chart
  • Updates make manifests to output to the new chart's templates/ and inject the annotation via awk
  • Updates make install, make uninstall, and make deploy to use the new paths
  • Adds helm-lint-crds and helm-template-crds jobs to helm-validate.yml CI gate
  • Updates release.yml and helm.yml to bump both chart versions and push both charts on release
  • Adds docs/crd-management.md with compatibility commitment, install/upgrade/rollback instructions, and a migration guide for existing users
  • Updates README.md installation section to document the two-chart install flow

Upgrade order

# Install / upgrade CRDs first, then the controller
helm upgrade temporal-worker-controller-crds ...
helm upgrade temporal-worker-controller ...

Rollback order

# Roll back controller first, CRDs optionally afterward
helm rollback temporal-worker-controller
helm rollback temporal-worker-controller-crds  # usually not needed

Migration for existing users (Controller Helm Chart v0.12.0 and earlier)

Helm does not delete CRDs when upgrading to the new chart, so upgrading the main chart is safe. A one-time helm install temporal-worker-controller-crds adopts the existing CRDs into the new release.

Test plan

  • helm lint --strict helm/temporal-worker-controller-crds passes
  • helm template test-release helm/temporal-worker-controller-crds emits two CRDs with helm.sh/resource-policy: keep
  • helm template test-release helm/temporal-worker-controller emits zero CRDs
  • CI helm-validate.yml passes all 4 jobs (lint, template, lint-crds, template-crds)

🤖 Generated with Claude Code

@carlydf carlydf requested review from a team and jlegrone as code owners February 26, 2026 05:00

Helm's `crds/` directory installs CRDs on `helm install` but **silently ignores them on `helm upgrade`** and does not delete them on `helm uninstall`. This means users have no supported mechanism to upgrade CRDs when upgrading the controller chart via the standard Helm workflow.

To provide an explicit, version-tracked CRD upgrade path, the temporal-worker-controller ships CRDs as a separate Helm chart: `temporal-worker-controller-crds`. This is the same pattern used by [Karpenter](https://karpenter.sh/docs/upgrading/upgrade-guide/) (`karpenter-crd`), [prometheus-operator-crds](https://github.com/prometheus-community/helm-charts/tree/main/charts/prometheus-operator-crds), and [VictoriaMetrics](https://docs.victoriametrics.com/helm/victoria-metrics-operator-crds/) (`victoria-metrics-operator-crds`).
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: personally never heard of VictoriaMetrics, and not sure it is needed to have this bit of information here as Prometheus is a strong enough example imo

@@ -0,0 +1,5 @@
apiVersion: v2
name: temporal-worker-controller-crds
description: CRDs for the Temporal Worker Controller. Install this chart before the temporal-worker-controller chart.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if we do independently version our worker-controller and the corresponding crd's, it's quite easy for both of them to drift apart imho. There shouldn't be a world where we have the user doing math and looking at a combinatorial matrix to understand which crd version is compatible with which controller version.

wdyt about the idea of adding to the ci a validation step which shall check/enforce this?

@@ -0,0 +1,147 @@
# CRD Management
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

actually, just while reading this I began to wonder if we really do need this (maybe we do, and would be happy to be corrected). It seems to me that the main problem is that on a helm upgrade, CRD's don't upgrade and this could be annoying to users.

However, what if we just moved the crds into the helm chart of the controller itself? in that way, the helm upgrade command would work and we don't need to create a new chart too

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you could argue for us to actually then just version them independently, but I am struggling to think about why one would require independent versioning if there is a higher chance of these version mismatches happening down the line

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When someone rolls back the controller, in most cases, the safest thing is to not roll back the CRD. If the CRD is in the main helm chart itself, then rolling back would roll back the CRD definitions, and cause field loss.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants