Skip to content

fix(chart): repair immutable Deployment selector on skyhook->nodewright upgrade (#286) (cherry-pick)#289

Merged
lockwobr merged 1 commit into
release/v0.17.xfrom
fix-chart-bump
Jun 26, 2026
Merged

fix(chart): repair immutable Deployment selector on skyhook->nodewright upgrade (#286) (cherry-pick)#289
lockwobr merged 1 commit into
release/v0.17.xfrom
fix-chart-bump

Conversation

@lockwobr

@lockwobr lockwobr commented Jun 26, 2026

Copy link
Copy Markdown
Collaborator

helm upgrade fails with "spec.selector: field is immutable" when a release first installed from the pre-rename skyhook-operator chart preserves the controller- manager Deployment name across the rename (via fullnameOverride/nameOverride, as AICR does). The selector's app.kubernetes.io/name is sourced from the chart name, which changed skyhook-operator -> nodewright, and spec.selector is immutable.

  • source the controller-manager selector from a shared chart.managerSelectorLabels helper used by both the Deployment and the new migration hook
  • add a smart pre-upgrade hook (Job + least-privilege ServiceAccount/Role/ RoleBinding, gated by selectorMigration.enabled, default on) that deletes the Deployment only when its live selector does not match the desired one, so helm recreates it; verified no-op (no downtime) on normal upgrades
  • migrate the maintenance-job image off the now-paywalled bitnami/kubectl to alpine/kubectl:1.36.2 pinned by multi-arch digest (closes Migrate bitnami/kubectl cleanup-job image to a maintained, versioned alternative #207)
  • add helm-template chainsaw asserts for the selector pin and each hook resource
  • docs: add chart RELEASE_NOTES, update release-process image example and ROADMAP

Fixes #285
Closes #207

Description

Checklist

  • I am familiar with the Contributing Guidelines.
  • My commits are signed off (git commit -s) per the DCO.
  • New or existing tests cover these changes.
  • The documentation is up to date with these changes.

…ht upgrade (#286)

helm upgrade fails with "spec.selector: field is immutable" when a release first
installed from the pre-rename skyhook-operator chart preserves the controller-
manager Deployment name across the rename (via fullnameOverride/nameOverride, as
AICR does). The selector's app.kubernetes.io/name is sourced from the chart name,
which changed skyhook-operator -> nodewright, and spec.selector is immutable.

- source the controller-manager selector from a shared chart.managerSelectorLabels
  helper used by both the Deployment and the new migration hook
- add a smart pre-upgrade hook (Job + least-privilege ServiceAccount/Role/
  RoleBinding, gated by selectorMigration.enabled, default on) that deletes the
  Deployment only when its live selector does not match the desired one, so helm
  recreates it; verified no-op (no downtime) on normal upgrades
- migrate the maintenance-job image off the now-paywalled bitnami/kubectl to
  alpine/kubectl:1.36.2 pinned by multi-arch digest (closes #207)
- add helm-template chainsaw asserts for the selector pin and each hook resource
- docs: add chart RELEASE_NOTES, update release-process image example and ROADMAP

Fixes #285
Closes #207

Signed-off-by: Brian Lockwood <lockwobr@gmail.com>
@lockwobr lockwobr self-assigned this Jun 26, 2026
@lockwobr lockwobr requested a review from a team June 26, 2026 18:51
@lockwobr lockwobr enabled auto-merge (squash) June 26, 2026 18:52
@github-actions github-actions Bot added doc Documentation change (PR path label; doc issues use the Documentation type) component/operator Skyhook operator (controller-manager) component/chart Helm chart component/ci CI workflows, GitHub Actions, and repo tooling component/tests End-to-end / chainsaw test suites (k8s-tests) labels Jun 26, 2026
@coderabbitai

coderabbitai Bot commented Jun 26, 2026

Copy link
Copy Markdown
Contributor

Review Change Stack

📝 Walkthrough

Walkthrough

The chart adds a shared selector-label helper and updates the controller-manager Deployment to use it. It introduces an optional pre-upgrade selector-migration Job with its own ServiceAccount, Role, and RoleBinding that deletes the Deployment when live selector labels differ from the desired set. The cleanup Jobs switch to alpine/kubectl and add writable cache mounts and KUBECACHEDIR, and the docs and Chainsaw tests were updated accordingly.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

  • NVIDIA/nodewright#286: The main PR makes the same Helm selector-migration changes as the retrieved PR—adding chart.managerSelectorLabels, updating the controller-manager Deployment selector to use it, introducing the pre-upgrade selector-migration Job/SA/Role/RoleBinding, switching cleanup/migration kubectl images to pinned alpine/kubectl, and updating the selector-migration Helm template tests and release docs.

Suggested labels

doc, component/chart, component/tests

Suggested reviewers

  • ayuskauskas
  • rice-riley
🚥 Pre-merge checks | ✅ 4
✅ Passed checks (4 passed)
Check name Status Explanation
Title check ✅ Passed The title clearly matches the main fix: repairing the Deployment selector across the skyhook-to-nodewright rename.
Description check ✅ Passed The description is directly related to the selector fix, migration hook, image migration, tests, and docs changes.
Linked Issues check ✅ Passed The changes satisfy #285 and #207 by adding the shared selector helper, gated pre-upgrade migration hook, and a pinned kubectl replacement.
Out of Scope Changes check ✅ Passed The added release notes, docs updates, and tests all support the linked selector and kubectl migration work.
✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch fix-chart-bump

Warning

There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure.

🔧 Trivy (0.69.3)

Trivy execution failed: 2026-06-26T18:51:52Z FATAL Fatal error run error: fs scan error: scan error: scan failed: failed analysis: post analysis error: post analysis error: azure-arm scan error: fs filter error: fs filter error: walk error range error: stat smartylint.json: no such file or directory: range error: stat smartylint.json: no such file or directory


Comment @coderabbitai help to get the list of available commands.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@chart/RELEASE_NOTES.md`:
- Around line 35-50: Update the release note in RELEASE_NOTES.md to explicitly
mention the brief controller-manager reconciliation gap during the automatic
selector migration path. In the section describing selectorMigration and the
helm upgrade behavior, clarify that chart/templates/selector-migration-job.yaml
uses a plain kubectl delete deployment, which temporarily removes the running
manager until Helm recreates it. Keep the existing manual-recreate guidance, but
add a short warning near the “No action is required” text so operators know to
expect the transient control-plane gap.

In `@k8s-tests/chainsaw/helm/helm-template-test/chainsaw-test.yaml`:
- Around line 145-208: Add a regression test for nameOverride in
chainsaw-test.yaml alongside the existing fullnameOverride case. The current
assertions in the override-keeps-name-and-aligns-hook scenario only cover
fullnameOverride, but the immutable selector and hook/RBAC behavior also depend
on chart.name-derived labels. Add a new render/assertion path that templates
deployment.yaml, selector-migration-role.yaml, and selector-migration-job.yaml
with nameOverride set, and verify the Deployment selector, Role resourceNames,
and hook target still match the expected overridden/preserved names.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Enterprise

Run ID: 26213de8-4e63-4a0f-9149-7e18f6159d8a

📥 Commits

Reviewing files that changed from the base of the PR and between 27c8f99 and c4082dc.

📒 Files selected for processing (13)
  • ROADMAP.md
  • chart/RELEASE_NOTES.md
  • chart/templates/_helpers.tpl
  • chart/templates/cleanup-skyhooks-job.yaml
  • chart/templates/cleanup-webhook-job.yaml
  • chart/templates/deployment.yaml
  • chart/templates/selector-migration-job.yaml
  • chart/templates/selector-migration-role.yaml
  • chart/templates/selector-migration-rolebinding.yaml
  • chart/templates/selector-migration-serviceaccount.yaml
  • chart/values.yaml
  • docs/release-process.md
  • k8s-tests/chainsaw/helm/helm-template-test/chainsaw-test.yaml

Comment thread chart/RELEASE_NOTES.md
Comment thread k8s-tests/chainsaw/helm/helm-template-test/chainsaw-test.yaml
@lockwobr lockwobr merged commit 30cebda into release/v0.17.x Jun 26, 2026
39 checks passed
@lockwobr lockwobr deleted the fix-chart-bump branch June 26, 2026 19:10
@coveralls

Copy link
Copy Markdown

Coverage Report for CI Build 28258679324

Warning

Build has drifted: This PR's base is out of sync with its target branch, so coverage data may include unrelated changes.
Quick fix: rebase this PR. Learn more →

Warning

No base build found for commit 27c8f99 on release/v0.17.x.
Coverage changes can't be calculated without a base build.
If a base build is processing, this comment will update automatically when it completes.

Coverage: 81.703%

Details

  • Patch coverage: No coverable lines changed in this PR.

Uncovered Changes

No uncovered changes found.

Coverage Regressions

Requires a base build to compare against. How to fix this →


Coverage Stats

Coverage Status
Relevant Lines: 9914
Covered Lines: 8100
Line Coverage: 81.7%
Coverage Strength: 9.31 hits per line

💛 - Coveralls

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

component/chart Helm chart component/ci CI workflows, GitHub Actions, and repo tooling component/operator Skyhook operator (controller-manager) component/tests End-to-end / chainsaw test suites (k8s-tests) doc Documentation change (PR path label; doc issues use the Documentation type)

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants