fix(chart): repair immutable Deployment selector on skyhook->nodewright upgrade (#286) (cherry-pick)#289
Conversation
…ht upgrade (#286) helm upgrade fails with "spec.selector: field is immutable" when a release first installed from the pre-rename skyhook-operator chart preserves the controller- manager Deployment name across the rename (via fullnameOverride/nameOverride, as AICR does). The selector's app.kubernetes.io/name is sourced from the chart name, which changed skyhook-operator -> nodewright, and spec.selector is immutable. - source the controller-manager selector from a shared chart.managerSelectorLabels helper used by both the Deployment and the new migration hook - add a smart pre-upgrade hook (Job + least-privilege ServiceAccount/Role/ RoleBinding, gated by selectorMigration.enabled, default on) that deletes the Deployment only when its live selector does not match the desired one, so helm recreates it; verified no-op (no downtime) on normal upgrades - migrate the maintenance-job image off the now-paywalled bitnami/kubectl to alpine/kubectl:1.36.2 pinned by multi-arch digest (closes #207) - add helm-template chainsaw asserts for the selector pin and each hook resource - docs: add chart RELEASE_NOTES, update release-process image example and ROADMAP Fixes #285 Closes #207 Signed-off-by: Brian Lockwood <lockwobr@gmail.com>
📝 WalkthroughWalkthroughThe chart adds a shared selector-label helper and updates the controller-manager Deployment to use it. It introduces an optional pre-upgrade selector-migration Job with its own ServiceAccount, Role, and RoleBinding that deletes the Deployment when live selector labels differ from the desired set. The cleanup Jobs switch to Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes Possibly related PRs
Suggested labels
Suggested reviewers
🚥 Pre-merge checks | ✅ 4✅ Passed checks (4 passed)
✨ Finishing Touches🧪 Generate unit tests (beta)
Warning There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure. 🔧 Trivy (0.69.3)Trivy execution failed: 2026-06-26T18:51:52Z FATAL Fatal error run error: fs scan error: scan error: scan failed: failed analysis: post analysis error: post analysis error: azure-arm scan error: fs filter error: fs filter error: walk error range error: stat smartylint.json: no such file or directory: range error: stat smartylint.json: no such file or directory Comment |
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@chart/RELEASE_NOTES.md`:
- Around line 35-50: Update the release note in RELEASE_NOTES.md to explicitly
mention the brief controller-manager reconciliation gap during the automatic
selector migration path. In the section describing selectorMigration and the
helm upgrade behavior, clarify that chart/templates/selector-migration-job.yaml
uses a plain kubectl delete deployment, which temporarily removes the running
manager until Helm recreates it. Keep the existing manual-recreate guidance, but
add a short warning near the “No action is required” text so operators know to
expect the transient control-plane gap.
In `@k8s-tests/chainsaw/helm/helm-template-test/chainsaw-test.yaml`:
- Around line 145-208: Add a regression test for nameOverride in
chainsaw-test.yaml alongside the existing fullnameOverride case. The current
assertions in the override-keeps-name-and-aligns-hook scenario only cover
fullnameOverride, but the immutable selector and hook/RBAC behavior also depend
on chart.name-derived labels. Add a new render/assertion path that templates
deployment.yaml, selector-migration-role.yaml, and selector-migration-job.yaml
with nameOverride set, and verify the Deployment selector, Role resourceNames,
and hook target still match the expected overridden/preserved names.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Enterprise
Run ID: 26213de8-4e63-4a0f-9149-7e18f6159d8a
📒 Files selected for processing (13)
ROADMAP.mdchart/RELEASE_NOTES.mdchart/templates/_helpers.tplchart/templates/cleanup-skyhooks-job.yamlchart/templates/cleanup-webhook-job.yamlchart/templates/deployment.yamlchart/templates/selector-migration-job.yamlchart/templates/selector-migration-role.yamlchart/templates/selector-migration-rolebinding.yamlchart/templates/selector-migration-serviceaccount.yamlchart/values.yamldocs/release-process.mdk8s-tests/chainsaw/helm/helm-template-test/chainsaw-test.yaml
Coverage Report for CI Build 28258679324Warning Build has drifted: This PR's base is out of sync with its target branch, so coverage data may include unrelated changes. Warning No base build found for commit Coverage: 81.703%Details
Uncovered ChangesNo uncovered changes found. Coverage RegressionsRequires a base build to compare against. How to fix this → Coverage Stats
💛 - Coveralls |
helm upgrade fails with "spec.selector: field is immutable" when a release first installed from the pre-rename skyhook-operator chart preserves the controller- manager Deployment name across the rename (via fullnameOverride/nameOverride, as AICR does). The selector's app.kubernetes.io/name is sourced from the chart name, which changed skyhook-operator -> nodewright, and spec.selector is immutable.
bitnami/kubectlcleanup-job image to a maintained, versioned alternative #207)Fixes #285
Closes #207
Description
Checklist
git commit -s) per the DCO.