Skip to content

fix(chart): repair immutable Deployment selector on skyhook->nodewright upgrade#286

Merged
lockwobr merged 1 commit into
mainfrom
fix/chart-selector-migration
Jun 26, 2026
Merged

fix(chart): repair immutable Deployment selector on skyhook->nodewright upgrade#286
lockwobr merged 1 commit into
mainfrom
fix/chart-selector-migration

Conversation

@lockwobr

Copy link
Copy Markdown
Collaborator

What

Fixes the helm upgrade failure where Kubernetes rejects an immutable
spec.selector change introduced by the skyhook-operator -> nodewright chart
rename, and migrates the maintenance-job image off the now-paywalled
bitnami/kubectl.

Fixes #285. Closes #207. (Reported downstream in NVIDIA/aicr#1382.)

Why it fails (and when)

chart/templates/deployment.yaml sourced the controller-manager selector's
app.kubernetes.io/name from the chart name, which changed at v0.16.0
(skyhook-operator -> nodewright). Deployment.spec.selector is immutable.

The failure only triggers when the controller-manager Deployment name is
preserved
across the rename, forcing Helm to patch in place. That happens when
a release pins the name via fullnameOverride/nameOverride (e.g. AICR sets
fullnameOverride: skyhook-operator). Without a pinned name, Helm renders a
differently named Deployment and simply create+deletes it, so the upgrade
succeeds (silently replacing the operator) rather than erroring.

Changes

  • Shared selector helper. New chart.managerSelectorLabels in _helpers.tpl,
    used by both the Deployment selector and the migration hook so they reason
    about the identical label set. This does not change the selector's rendered
    value; it keeps the chart's existing three-label selector.
  • Smart pre-upgrade hook. New selector-migration-{job,serviceaccount,role, rolebinding}.yaml (gated by selectorMigration.enabled, default true). The
    Job reads the live Deployment selector and deletes it only when it does not
    match
    the desired selector (per-label jsonpath compare), letting Helm
    recreate it. It is a no-op (no downtime) on normal upgrades. It runs under its
    own least-privilege ServiceAccount/Role (get/list/delete on deployments),
    because during pre-upgrade the release's manager RBAC is not applied yet.
  • Image migration (Migrate bitnami/kubectl cleanup-job image to a maintained, versioned alternative #207). webhook.removalImage/Tag/Digest and the three
    maintenance Jobs move from bitnami/kubectl to
    alpine/kubectl:1.36.2 pinned by multi-arch digest.
  • Tests. helm-template-test gains structural chainsaw asserts for the
    selector pin and each hook resource (single-doc templates so assert can match
    by kind), plus a disabled-gate check.
  • Docs. New chart/RELEASE_NOTES.md (upgrade notes), updated
    docs/release-process.md image example, and ROADMAP.md.

New pattern (flagged per CONTRIBUTING)

This introduces the chart's first pre-upgrade hook with its own dedicated
hook RBAC (the existing hooks are pre-delete and reuse the controller-manager
SA). A dedicated SA is required because the new manager RBAC is not yet applied
during pre-upgrade, and the manager SA lacks deployment-delete permission. The
hook is modeled on the existing cleanup-*-job.yaml hooks otherwise.

chart/ vs operator/config

This pins the chart's existing three-label selector; it does not re-align to
operator/config/manager/manager.yaml, which intentionally stays narrower
(control-plane only). No kustomize mirror is needed (this is a Helm-only
install-time concern). Other selector-bearing resources were audited: the PDB
keys on chart.fullname (moves with its own name, not affected); Service /
NetworkPolicy selectors are mutable; CRD matchLabels are schema fields. The
Deployment selector was the only immutable-field instance.

Testing

  • Live e2e on kind (published skyhook-public/skyhook-operator v0.15.1 ->
    this chart, with fullnameOverride: skyhook-operator):
    • control (hook disabled) reproduces spec.selector ... field is immutable
    • hook enabled: upgrade succeeds, selector flips to nodewright, Deployment
      recreated
    • second upgrade: Deployment UID unchanged -> genuine no-op, no downtime
  • helm-template-test: passes (selector pin + all four hook resources asserted
    • disabled gate)
  • helm lint: clean

@lockwobr lockwobr requested a review from a team June 25, 2026 22:43
@github-actions github-actions Bot added doc Documentation change (PR path label; doc issues use the Documentation type) component/operator Skyhook operator (controller-manager) component/chart Helm chart component/ci CI workflows, GitHub Actions, and repo tooling component/tests End-to-end / chainsaw test suites (k8s-tests) labels Jun 25, 2026
@lockwobr lockwobr self-assigned this Jun 25, 2026
@coderabbitai

coderabbitai Bot commented Jun 25, 2026

Copy link
Copy Markdown
Contributor

Review Change Stack

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review
📝 Walkthrough

Walkthrough

Adds a shared controller-manager selector-label helper, updates the Deployment selector to use it, and introduces a default-enabled pre-upgrade Job with dedicated ServiceAccount, Role, and RoleBinding that compares live selector labels before deleting the Deployment when they differ. The chart also switches cleanup jobs to pinned alpine/kubectl images, updates values and docs, and extends Helm template tests.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Suggested reviewers

  • ayuskauskas
  • rice-riley
🚥 Pre-merge checks | ✅ 4
✅ Passed checks (4 passed)
Check name Status Explanation
Title check ✅ Passed The title clearly summarizes the main fix: handling the immutable Deployment selector during the skyhook-to-nodewright upgrade.
Description check ✅ Passed The description matches the PR by explaining the selector-migration hook, kubectl image migration, tests, and docs updates.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.
✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch fix/chart-selector-migration

Comment @coderabbitai help to get the list of available commands.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@docs/release-process.md`:
- Around line 306-308: The follow-up instruction is stale and references the
wrong image updates. In the release-process guidance near the maintenance-job
`alpine/kubectl` note, update the next sentence to describe the correct action
for `webhook.removalTag` and keep it aligned with the supported Kubernetes
range, rather than telling readers to change the `kube-rbac-proxy`, operator,
and agent digests. Use the existing `webhook.removalTag` and
`docs/kubernetes-support.md` references to keep the wording tied to the right
config.

In `@k8s-tests/chainsaw/helm/helm-template-test/chainsaw-test.yaml`:
- Around line 175-186: The helm template test only checks that the rendered hook
image comes from alpine/kubectl, so it can still pass if the image is unpinned.
Tighten the assertions in the chainsaw test script by updating the check near
the helm template command to also verify the digest pin, either by matching the
full rendered image or by asserting the output contains `@sha256`:, while keeping
the existing repository-switch checks intact.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Enterprise

Run ID: 900b7716-ea05-4381-817d-b09c5bc028fe

📥 Commits

Reviewing files that changed from the base of the PR and between a1f3bec and e188e55.

📒 Files selected for processing (13)
  • ROADMAP.md
  • chart/RELEASE_NOTES.md
  • chart/templates/_helpers.tpl
  • chart/templates/cleanup-skyhooks-job.yaml
  • chart/templates/cleanup-webhook-job.yaml
  • chart/templates/deployment.yaml
  • chart/templates/selector-migration-job.yaml
  • chart/templates/selector-migration-role.yaml
  • chart/templates/selector-migration-rolebinding.yaml
  • chart/templates/selector-migration-serviceaccount.yaml
  • chart/values.yaml
  • docs/release-process.md
  • k8s-tests/chainsaw/helm/helm-template-test/chainsaw-test.yaml

Comment thread docs/release-process.md Outdated
Comment thread k8s-tests/chainsaw/helm/helm-template-test/chainsaw-test.yaml
@coveralls

coveralls commented Jun 25, 2026

Copy link
Copy Markdown

Coverage Report for CI Build 28252625797

Coverage decreased (-0.02%) to 81.683%

Details

  • Coverage decreased (-0.02%) from the base build.
  • Patch coverage: No coverable lines changed in this PR.
  • 2 coverage regressions across 1 file.

Uncovered Changes

No uncovered changes found.

Coverage Regressions

2 previously-covered lines in 1 file lost coverage.

File Lines Losing Coverage Coverage
operator/internal/controller/skyhook_controller.go 2 83.84%

Coverage Stats

Coverage Status
Relevant Lines: 9925
Covered Lines: 8107
Line Coverage: 81.68%
Coverage Strength: 9.29 hits per line

💛 - Coveralls

rice-riley
rice-riley previously approved these changes Jun 25, 2026
@lockwobr

Copy link
Copy Markdown
Collaborator Author

Addressed both CodeRabbit findings (force-pushed):

  1. docs/release-process.md stale follow-up — fixed, but not exactly as suggested. The suggestion replaced the "kube-rbac-proxy, operator, and agent" list with the maintenance-job fields, which would have dropped the other three images that also get digest-pinned. Instead I added the maintenance kubectl image to the list: ... for the kube-rbac-proxy, operator, agent, and maintenance-job kubectl (webhook.removalImage/removalTag/removalDigest) images:.

  2. Test didn't assert the digest pin — fixed. Added "(contains($stdout, '@sha256:'))": true to the selector-migration-job-renders image check so an unpinned alpine/kubectl:<tag> regression now fails the test. Verified the template test still passes (1 test, 0 failures).

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@chart/templates/selector-migration-job.yaml`:
- Around line 83-94: The selector migration job’s kubectl execution is running
with a read-only root filesystem, so it needs an explicit writable cache
location. Update the selector migration hook/template by adding a writable
emptyDir volume and mounting it in the same pod spec used by kubectl, then
configure kubectl to use that path via KUBECACHEDIR or the --cache-dir option so
discovery caching works without permission errors.

In `@k8s-tests/chainsaw/helm/helm-template-test/chainsaw-test.yaml`:
- Around line 145-174: The selector-migration-job-renders test only checks the
Job shape, so it can miss regressions in the selector payload. Update the
chainsaw assertion for the selector-migration job to verify the rendered command
in the job container uses the full managerSelectorLabels set, not just
control-plane=controller-manager. Focus on the selector-migration-job template
and its migration container command so the rendered output includes
control-plane, app.kubernetes.io/name, and app.kubernetes.io/instance.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Enterprise

Run ID: 9c696aed-a479-4678-b2b5-af4df15a0fd5

📥 Commits

Reviewing files that changed from the base of the PR and between e188e55 and ea75b77.

📒 Files selected for processing (13)
  • ROADMAP.md
  • chart/RELEASE_NOTES.md
  • chart/templates/_helpers.tpl
  • chart/templates/cleanup-skyhooks-job.yaml
  • chart/templates/cleanup-webhook-job.yaml
  • chart/templates/deployment.yaml
  • chart/templates/selector-migration-job.yaml
  • chart/templates/selector-migration-role.yaml
  • chart/templates/selector-migration-rolebinding.yaml
  • chart/templates/selector-migration-serviceaccount.yaml
  • chart/values.yaml
  • docs/release-process.md
  • k8s-tests/chainsaw/helm/helm-template-test/chainsaw-test.yaml

Comment thread chart/templates/selector-migration-job.yaml
Comment thread k8s-tests/chainsaw/helm/helm-template-test/chainsaw-test.yaml
@lockwobr lockwobr force-pushed the fix/chart-selector-migration branch from ea75b77 to 1836257 Compare June 26, 2026 00:09
@lockwobr

Copy link
Copy Markdown
Collaborator Author

Addressed both findings from the latest review (force-pushed):

  1. Test now asserts the selector payload, not just the Job shape (Major). Added checks that the rendered hook command compares all three desired labels, so a regression to comparing only control-plane=controller-manager fails the test:

    "(contains($stdout, 'selector label control-plane differs'))": true
    "(contains($stdout, 'selector label app.kubernetes.io/name differs'))": true
    "(contains($stdout, 'selector label app.kubernetes.io/instance differs'))": true
    
  2. Writable kubectl cache under readOnlyRootFilesystem. Added a kube-cache emptyDir mounted at /tmp/kubecache with KUBECACHEDIR=/tmp/kubecache so kubectl can write its discovery/http cache instead of re-discovering the API on every call. Applied to the new selector-migration Job and the two existing cleanup Jobs (same read-only-fs + kubectl pattern) so all three stay consistent.

Re-validated: helm lint clean, helm-template-test passes (now including the selector-payload and digest assertions), and the live kind e2e still passes (old skyhook-operator v0.15.1 -> this chart with fullnameOverride: upgrade succeeds, selector flips to nodewright, Deployment recreated; second upgrade is a no-op with stable UID).

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@chart/templates/deployment.yaml`:
- Around line 18-24: The Deployment template is still duplicating selector
labels in the pod template instead of reusing the shared selector helper. Update
the pod template labels in chart/templates/deployment.yaml to use
chart.managerSelectorLabels the same way spec.selector already does, so the
selector and template stay in sync if the helper changes. Use the existing
Deployment.spec.selector and pod template labels section as the places to align.

In `@chart/templates/selector-migration-role.yaml`:
- Around line 13-21: The hook Role currently grants namespace-wide Deployment
access in the selector migration template. Update the rules in the
selector-migration Role manifest to scope the controller-manager access by using
resourceNames for the target Deployment in the get and delete permissions, and
remove the list verb entirely. Keep the change localized to the Role rules so
the hook token only has access to the controller-manager Deployment.

In `@chart/values.yaml`:
- Around line 159-165: The maintenance kubectl image pinned in the values for
the removal jobs is outside the supported Kubernetes skew for the chart’s
documented minimum version. Update the removalImage/removalTag/removalDigest
entry so the kubectl version matches the supported cluster range, or raise the
chart’s minimum Kubernetes version in the same release; keep the fix aligned
with the maintenance job settings referenced by removalImage and removalTag.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Enterprise

Run ID: 89bbaea0-85a4-4ced-9422-046f7a366f08

📥 Commits

Reviewing files that changed from the base of the PR and between ea75b77 and 1836257.

📒 Files selected for processing (13)
  • ROADMAP.md
  • chart/RELEASE_NOTES.md
  • chart/templates/_helpers.tpl
  • chart/templates/cleanup-skyhooks-job.yaml
  • chart/templates/cleanup-webhook-job.yaml
  • chart/templates/deployment.yaml
  • chart/templates/selector-migration-job.yaml
  • chart/templates/selector-migration-role.yaml
  • chart/templates/selector-migration-rolebinding.yaml
  • chart/templates/selector-migration-serviceaccount.yaml
  • chart/values.yaml
  • docs/release-process.md
  • k8s-tests/chainsaw/helm/helm-template-test/chainsaw-test.yaml

Comment thread chart/templates/deployment.yaml
Comment thread chart/templates/selector-migration-role.yaml
Comment thread chart/values.yaml
@lockwobr lockwobr force-pushed the fix/chart-selector-migration branch from 1836257 to ffd9375 Compare June 26, 2026 16:06

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@docs/release-process.md`:
- Line 286: The release-process snippet uses the wrong Docker command name, so
update the prerequisite and the example to use docker buildx instead of
docker-buildx. Make the change in the kubectl image inspection command so the
documented command matches the standard Docker CLI and is consistent with the
rest of the release-process instructions.

In `@k8s-tests/chainsaw/helm/helm-template-test/chainsaw-test.yaml`:
- Around line 228-247: The inline least-privilege comment is outdated because
the rendered Role in the helm template test only grants get and delete, not
list. Update the comment near the helm template assertion in chainsaw-test.yaml
to match the actual permissions enforced by the selector-migration Role, using
the same nodewright-selector-migration and selector-migration-role.yaml context
to keep it accurate.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Enterprise

Run ID: cfbacf62-9df7-4ea2-8cd7-35d35860da3a

📥 Commits

Reviewing files that changed from the base of the PR and between 1836257 and ffd9375.

📒 Files selected for processing (13)
  • ROADMAP.md
  • chart/RELEASE_NOTES.md
  • chart/templates/_helpers.tpl
  • chart/templates/cleanup-skyhooks-job.yaml
  • chart/templates/cleanup-webhook-job.yaml
  • chart/templates/deployment.yaml
  • chart/templates/selector-migration-job.yaml
  • chart/templates/selector-migration-role.yaml
  • chart/templates/selector-migration-rolebinding.yaml
  • chart/templates/selector-migration-serviceaccount.yaml
  • chart/values.yaml
  • docs/release-process.md
  • k8s-tests/chainsaw/helm/helm-template-test/chainsaw-test.yaml

Comment thread docs/release-process.md Outdated
Comment thread k8s-tests/chainsaw/helm/helm-template-test/chainsaw-test.yaml Outdated
@lockwobr lockwobr force-pushed the fix/chart-selector-migration branch from ffd9375 to bb5cb29 Compare June 26, 2026 16:23

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@docs/release-process.md`:
- Line 299: The kubectl image reference in the release-process guidance is
pinned outside the supported client/server skew for the documented minimum
Kubernetes version. Update the kubectl tag used for the short-lived maintenance
jobs to a version within one minor of the oldest supported cluster, or revise
the documented minimum Kubernetes version so it matches the pinned kubectl
version; keep the change aligned with the existing kubectl guidance in the
release-process document.

In `@k8s-tests/chainsaw/helm/helm-template-test/chainsaw-test.yaml`:
- Around line 145-196: The selector-migration job template test is missing
coverage for the writable kubectl cache setup added to support a read-only root
filesystem. Update the assertions in the selector-migration-job render test to
verify the Job spec from selector-migration-job.yaml includes the
KUBECACHEDIR=/tmp/kubecache environment setting and the writable emptyDir volume
mount for the kubectl cache, alongside the existing image and selector checks.
Focus on the rendered Job manifest and its container spec so regressions in the
cache contract are caught.
- Around line 119-196: The current helm template tests only cover default chart
values, so they miss the rename/override upgrade path that breaks immutable
selectors. Update the chainsaw cases in the deployment and selector-migration
job tests to render with a nameOverride/fullnameOverride scenario that preserves
the controller-manager identity, and assert the Deployment selector and
selector-migration hook still match there. Use the existing test steps around
deployment-selector-pinned-labels and selector-migration-job-renders to add the
override-based render and corresponding expectations.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Enterprise

Run ID: 4e8ab392-94b4-416b-929b-40e25d85c329

📥 Commits

Reviewing files that changed from the base of the PR and between ffd9375 and bb5cb29.

📒 Files selected for processing (13)
  • ROADMAP.md
  • chart/RELEASE_NOTES.md
  • chart/templates/_helpers.tpl
  • chart/templates/cleanup-skyhooks-job.yaml
  • chart/templates/cleanup-webhook-job.yaml
  • chart/templates/deployment.yaml
  • chart/templates/selector-migration-job.yaml
  • chart/templates/selector-migration-role.yaml
  • chart/templates/selector-migration-rolebinding.yaml
  • chart/templates/selector-migration-serviceaccount.yaml
  • chart/values.yaml
  • docs/release-process.md
  • k8s-tests/chainsaw/helm/helm-template-test/chainsaw-test.yaml

Comment thread docs/release-process.md
Comment thread k8s-tests/chainsaw/helm/helm-template-test/chainsaw-test.yaml
Comment thread k8s-tests/chainsaw/helm/helm-template-test/chainsaw-test.yaml
@lockwobr

Copy link
Copy Markdown
Collaborator Author

Note on the unit-tests failure: it's an unrelated pre-existing flake, not caused by this PR (which touches only chart/, docs/, k8s-tests/chainsaw/, ROADMAP.md -- zero Go code). The failing spec is HandleConfigUpdates config sync gate ... does not short-circuit the reconcile when a config sync is only pending, failing on metadata.ownerReferences.uid: Invalid value: "": must not be empty -- an envtest fixture race where the owner UID isn't populated before the ConfigMap is created. The force-push re-runs CI, giving it a fresh execution.

…ht upgrade

helm upgrade fails with "spec.selector: field is immutable" when a release first
installed from the pre-rename skyhook-operator chart preserves the controller-
manager Deployment name across the rename (via fullnameOverride/nameOverride, as
AICR does). The selector's app.kubernetes.io/name is sourced from the chart name,
which changed skyhook-operator -> nodewright, and spec.selector is immutable.

- source the controller-manager selector from a shared chart.managerSelectorLabels
  helper used by both the Deployment and the new migration hook
- add a smart pre-upgrade hook (Job + least-privilege ServiceAccount/Role/
  RoleBinding, gated by selectorMigration.enabled, default on) that deletes the
  Deployment only when its live selector does not match the desired one, so helm
  recreates it; verified no-op (no downtime) on normal upgrades
- migrate the maintenance-job image off the now-paywalled bitnami/kubectl to
  alpine/kubectl:1.36.2 pinned by multi-arch digest (closes #207)
- add helm-template chainsaw asserts for the selector pin and each hook resource
- docs: add chart RELEASE_NOTES, update release-process image example and ROADMAP

Fixes #285
Closes #207

Signed-off-by: Brian Lockwood <lockwobr@gmail.com>
@lockwobr lockwobr force-pushed the fix/chart-selector-migration branch from 9bdcec0 to 2dc48bd Compare June 26, 2026 16:54
@lockwobr lockwobr enabled auto-merge (squash) June 26, 2026 17:20

@rice-riley rice-riley left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@lockwobr lockwobr merged commit 1d00f74 into main Jun 26, 2026
38 checks passed
@lockwobr lockwobr deleted the fix/chart-selector-migration branch June 26, 2026 18:35
lockwobr added a commit that referenced this pull request Jun 26, 2026
…ht upgrade (#286) (#289)

helm upgrade fails with "spec.selector: field is immutable" when a release first
installed from the pre-rename skyhook-operator chart preserves the controller-
manager Deployment name across the rename (via fullnameOverride/nameOverride, as
AICR does). The selector's app.kubernetes.io/name is sourced from the chart name,
which changed skyhook-operator -> nodewright, and spec.selector is immutable.

- source the controller-manager selector from a shared chart.managerSelectorLabels
  helper used by both the Deployment and the new migration hook
- add a smart pre-upgrade hook (Job + least-privilege ServiceAccount/Role/
  RoleBinding, gated by selectorMigration.enabled, default on) that deletes the
  Deployment only when its live selector does not match the desired one, so helm
  recreates it; verified no-op (no downtime) on normal upgrades
- migrate the maintenance-job image off the now-paywalled bitnami/kubectl to
  alpine/kubectl:1.36.2 pinned by multi-arch digest (closes #207)
- add helm-template chainsaw asserts for the selector pin and each hook resource
- docs: add chart RELEASE_NOTES, update release-process image example and ROADMAP

Fixes #285
Closes #207

Signed-off-by: Brian Lockwood <lockwobr@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

component/chart Helm chart component/ci CI workflows, GitHub Actions, and repo tooling component/operator Skyhook operator (controller-manager) component/tests End-to-end / chainsaw test suites (k8s-tests) doc Documentation change (PR path label; doc issues use the Documentation type)

Projects

None yet

3 participants