Separate CRDs Helm chart for upgradeable CRD lifecycle#208
Separate CRDs Helm chart for upgradeable CRD lifecycle#208
Conversation
…ions are out of sync but within 'compatible' window
|
|
||
| Helm's `crds/` directory installs CRDs on `helm install` but **silently ignores them on `helm upgrade`** and does not delete them on `helm uninstall`. This means users have no supported mechanism to upgrade CRDs when upgrading the controller chart via the standard Helm workflow. | ||
|
|
||
| To provide an explicit, version-tracked CRD upgrade path, the temporal-worker-controller ships CRDs as a separate Helm chart: `temporal-worker-controller-crds`. This is the same pattern used by [Karpenter](https://karpenter.sh/docs/upgrading/upgrade-guide/) (`karpenter-crd`), [prometheus-operator-crds](https://github.com/prometheus-community/helm-charts/tree/main/charts/prometheus-operator-crds), and [VictoriaMetrics](https://docs.victoriametrics.com/helm/victoria-metrics-operator-crds/) (`victoria-metrics-operator-crds`). |
There was a problem hiding this comment.
nit: personally never heard of VictoriaMetrics, and not sure it is needed to have this bit of information here as Prometheus is a strong enough example imo
| @@ -0,0 +1,5 @@ | |||
| apiVersion: v2 | |||
| name: temporal-worker-controller-crds | |||
| description: CRDs for the Temporal Worker Controller. Install this chart before the temporal-worker-controller chart. | |||
There was a problem hiding this comment.
if we do independently version our worker-controller and the corresponding crd's, it's quite easy for both of them to drift apart imho. There shouldn't be a world where we have the user doing math and looking at a combinatorial matrix to understand which crd version is compatible with which controller version.
wdyt about the idea of adding to the ci a validation step which shall check/enforce this?
| @@ -0,0 +1,147 @@ | |||
| # CRD Management | |||
There was a problem hiding this comment.
actually, just while reading this I began to wonder if we really do need this (maybe we do, and would be happy to be corrected). It seems to me that the main problem is that on a helm upgrade, CRD's don't upgrade and this could be annoying to users.
However, what if we just moved the crds into the helm chart of the controller itself? in that way, the helm upgrade command would work and we don't need to create a new chart too
There was a problem hiding this comment.
you could argue for us to actually then just version them independently, but I am struggling to think about why one would require independent versioning if there is a higher chance of these version mismatches happening down the line
There was a problem hiding this comment.
When someone rolls back the controller, in most cases, the safest thing is to not roll back the CRD. If the CRD is in the main helm chart itself, then rolling back would roll back the CRD definitions, and cause field loss.
Summary
temporal-worker-controller-crdsHelm chart with CRDs intemplates/(notcrds/) so they are upgraded onhelm upgradehelm.sh/resource-policy: keepto all CRDs so they survivehelm uninstallof either chartcrds/from the main chart; CRDs now live exclusively in the CRDs chartmake manifeststo output to the new chart'stemplates/and inject the annotation viaawkmake install,make uninstall, andmake deployto use the new pathshelm-lint-crdsandhelm-template-crdsjobs tohelm-validate.ymlCI gaterelease.ymlandhelm.ymlto bump both chart versions and push both charts on releasedocs/crd-management.mdwith compatibility commitment, install/upgrade/rollback instructions, and a migration guide for existing usersREADME.mdinstallation section to document the two-chart install flowUpgrade order
Rollback order
Migration for existing users (Controller Helm Chart v0.12.0 and earlier)
Helm does not delete CRDs when upgrading to the new chart, so upgrading the main chart is safe. A one-time
helm install temporal-worker-controller-crdsadopts the existing CRDs into the new release.Test plan
helm lint --strict helm/temporal-worker-controller-crdspasseshelm template test-release helm/temporal-worker-controller-crdsemits two CRDs withhelm.sh/resource-policy: keephelm template test-release helm/temporal-worker-controlleremits zero CRDshelm-validate.ymlpasses all 4 jobs (lint, template, lint-crds, template-crds)🤖 Generated with Claude Code