🌱 Ensure COS phase immutability for referenced object approach#2635
🌱 Ensure COS phase immutability for referenced object approach#2635pedjak wants to merge 2 commits intooperator-framework:mainfrom
Conversation
✅ Deploy Preview for olmv1 ready!
To edit notification comments on pull requests, go to your Netlify project configuration. |
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: The full list of commands accepted by this bot can be found here. DetailsNeeds approval from an approver in each of these files:Approvers can indicate their approval by writing |
There was a problem hiding this comment.
Pull request overview
Note
Copilot was unable to run its full agentic suite in this review.
Enforces ClusterObjectSet phase immutability when objects are sourced via referenced Secrets by requiring referenced Secrets to be immutable and by detecting Secret content changes using recorded hashes.
Changes:
- Add Secret verification in the COS reconciler (immutable requirement + SHA-256 hash comparison) and block reconciliation when verification fails.
- Persist referenced Secret hashes in
.status.observedObjectContainersand extend CRD/applyconfigurations accordingly. - Add/extend unit + e2e coverage and update docs to reflect the new behavior.
Reviewed changes
Copilot reviewed 14 out of 14 changed files in this pull request and generated 3 comments.
Show a summary per file
| File | Description |
|---|---|
| test/e2e/steps/steps.go | Adds new e2e steps for triggering reconciliation, message fragment matching, and checking observed Secret hashes in status. |
| test/e2e/features/revision.feature | Adds e2e scenarios for “mutable Secret” and “recreated Secret with changed content” blocking behavior. |
| manifests/experimental.yaml | Extends CRD schema with .status.observedObjectContainers. |
| manifests/experimental-e2e.yaml | Extends e2e CRD schema with .status.observedObjectContainers. |
| internal/operator-controller/controllers/resolve_ref_test.go | Updates ref-resolution tests to use immutable Secrets. |
| internal/operator-controller/controllers/clusterobjectset_controller_internal_test.go | Adds unit tests for Secret hashing and referenced-Secret verification. |
| internal/operator-controller/controllers/clusterobjectset_controller.go | Implements referenced Secret verification + hashing and blocks reconciliation on violations. |
| helm/olmv1/base/operator-controller/crd/experimental/olm.operatorframework.io_clusterobjectsets.yaml | Mirrors CRD schema changes for Helm packaging. |
| docs/draft/concepts/large-bundle-support.md | Updates documented behavior/conventions for referenced Secrets (immutability + hash enforcement). |
| applyconfigurations/utils.go | Registers apply-configuration kind for ObservedObjectContainer. |
| applyconfigurations/api/v1/observedobjectcontainer.go | Adds generated apply configuration for the new status type. |
| applyconfigurations/api/v1/clusterobjectsetstatus.go | Adds apply support for .status.observedObjectContainers. |
| api/v1/zz_generated.deepcopy.go | Adds deepcopy support for ObservedObjectContainer and status field. |
| api/v1/clusterobjectset_types.go | Introduces ObservedObjectContainer API type and status field. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
internal/operator-controller/controllers/clusterobjectset_controller.go
Outdated
Show resolved
Hide resolved
internal/operator-controller/controllers/clusterobjectset_controller.go
Outdated
Show resolved
Hide resolved
| func ClusterObjectSetReportsConditionWithMessageFragment(ctx context.Context, revisionName, conditionType, conditionStatus, conditionReason string, msgFragment *godog.DocString) error { | ||
| msgCmp := alwaysMatch | ||
| if msgFragment != nil { | ||
| expectedMsgFragment := substituteScenarioVars(strings.Join(strings.Fields(msgFragment.Content), " "), scenarioCtx(ctx)) | ||
| msgCmp = func(actualMsg string) bool { | ||
| return strings.Contains(actualMsg, expectedMsgFragment) | ||
| } | ||
| } | ||
| return waitForCondition(ctx, "clusterobjectset", substituteScenarioVars(revisionName, scenarioCtx(ctx)), conditionType, conditionStatus, &conditionReason, msgCmp) | ||
| } |
There was a problem hiding this comment.
This step normalizes whitespace in the expected fragment but not in the actual condition message; if the controller formats messages with newlines/multiple spaces (common when wrapping), strings.Contains may fail unexpectedly. Consider normalizing actualMsg with the same strings.Fields + Join approach before Contains to reduce e2e flakiness while still matching by fragment.
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 14 out of 14 changed files in this pull request and generated 2 comments.
Comments suppressed due to low confidence (1)
internal/operator-controller/controllers/clusterobjectset_controller.go:1
- Treating
Progressing=False, Reason=Blockedas terminal at the predicate level can prevent the controller from ever reconciling again, even if the user fixes the root cause (e.g., recreates the Secret as immutable, or changes COS spec/generation). If “Blocked” is intended to be recoverable, remove it from this predicate or gate suppression more narrowly (e.g., only suppress when generation hasn’t changed, while still allowing spec updates / explicit triggers to enqueue reconciles).
//go:build !standard
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
internal/operator-controller/controllers/clusterobjectset_controller.go
Outdated
Show resolved
Hide resolved
internal/operator-controller/controllers/clusterobjectset_controller.go
Outdated
Show resolved
Hide resolved
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## main #2635 +/- ##
==========================================
- Coverage 68.95% 68.93% -0.03%
==========================================
Files 139 140 +1
Lines 9891 9981 +90
==========================================
+ Hits 6820 6880 +60
- Misses 2562 2590 +28
- Partials 509 511 +2
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
api/v1/clusterobjectset_types.go
Outdated
| // +listType=map | ||
| // +listMapKey=name | ||
| // +optional | ||
| ObservedObjectContainers []ObservedObjectContainer `json:"observedObjectContainers,omitempty"` |
There was a problem hiding this comment.
Instead of requiring the individual objects to never change (and keeping track of per-object hashes), WDYT about computing a single hash of the resulting phases content. If that remains stable, do we care if the underlying organization of the secrets changed?
If we do that, it also means we abstract away any details of how we ultimately end up with those phases (i.e. we don't care if it is inline vs from secrets vs from a bundle ref, etc.), which would be more futureproof.
There was a problem hiding this comment.
If we do that, it also means we abstract away any details of how we ultimately end up with those phases (i.e. we don't care if it is inline vs from secrets vs from a bundle ref, etc.), which would be more futureproof.
interesting idea! implemented, ptal.
|
|
||
| func (c *ClusterObjectSetReconciler) SetupWithManager(mgr ctrl.Manager) error { | ||
| skipProgressDeadlineExceededPredicate := predicate.Funcs{ | ||
| skipTerminallyBlockedPredicate := predicate.Funcs{ |
There was a problem hiding this comment.
A terminally blocked COS could become unblocked if the user puts back the expected content though, right? We should continue reconciling to check if the expected content is back in place.
ClusterObjectSet phases are immutable by design, but when objects are stored in external Secrets via refs, the Secret content could be changed by deleting and recreating the Secret with the same name. This enforces phase immutability for the referenced object approach by verifying that referenced Secrets are marked immutable and that their content hashes have not changed since first resolution. On first successful reconciliation, SHA-256 hashes of referenced Secret data are recorded in .status.observedObjectContainers. On subsequent reconciles, the hashes are compared and reconciliation is blocked with Progressing=False/Reason=Blocked if any mismatch is detected. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
api/v1/clusterobjectset_types.go
Outdated
| // object content at first successful reconciliation. | ||
| // | ||
| // +required | ||
| Hash string `json:"hash"` |
There was a problem hiding this comment.
By creating the resolution digest we still needing this one: https://github.com/operator-framework/operator-controller/pull/2610/changes#diff-c804d15cab34d4c7c30fcd68d40d0269084c11db0736d844bd6ee8f60262a924R68
If so, could we change the name PhaseDigest?
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 15 out of 15 changed files in this pull request and generated 7 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| currentPhases := make([]ocv1.ObservedPhase, 0, len(phases)) | ||
| for _, phase := range phases { | ||
| hash, err := computePhaseHash(phase) | ||
| if err != nil { | ||
| setRetryingConditions(cos, err.Error()) | ||
| return ctrl.Result{}, fmt.Errorf("computing phase hash: %v", err) | ||
| } | ||
| currentPhases = append(currentPhases, ocv1.ObservedPhase{Name: phase.GetName(), Hash: hash}) | ||
| } |
There was a problem hiding this comment.
The phase hash is computed from the boxcutter phases returned by buildBoxcutterPhases(), after the reconciler mutates each resolved object (e.g., injecting labels.OwnerNameKey). This couples the stored .status.observedPhases hashes to controller-internal mutations and COS metadata, so future controller changes (or label changes) could falsely appear as “referenced content changed” and permanently block reconciliation. Consider hashing the pre-mutation resolved manifests (or normalizing/stripping controller-added fields) so the hash reflects only the referenced object content you’re trying to make immutable.
| if len(cos.Status.ObservedPhases) == 0 { | ||
| cos.Status.ObservedPhases = currentPhases | ||
| } else if err := verifyObservedPhases(cos.Status.ObservedPhases, currentPhases); err != nil { | ||
| l.Error(err, "resolved phases content changed, blocking reconciliation") | ||
| markAsNotProgressing(cos, ocv1.ClusterObjectSetReasonBlocked, err.Error()) | ||
| return ctrl.Result{}, nil |
There was a problem hiding this comment.
ObservedPhases is currently persisted as soon as buildBoxcutterPhases succeeds (i.e., refs resolve), even if the subsequent revision reconcile later fails or is still rolling out. This seems to conflict with the field/docs wording (“first successful reconciliation”) and can lock in hashes before a COS has actually succeeded. Either update the wording to reflect “first successful resolution/build” or only set ObservedPhases once the reconcile has reached a success terminal state.
| if len(cos.Status.ObservedPhases) == 0 { | |
| cos.Status.ObservedPhases = currentPhases | |
| } else if err := verifyObservedPhases(cos.Status.ObservedPhases, currentPhases); err != nil { | |
| l.Error(err, "resolved phases content changed, blocking reconciliation") | |
| markAsNotProgressing(cos, ocv1.ClusterObjectSetReasonBlocked, err.Error()) | |
| return ctrl.Result{}, nil | |
| if len(cos.Status.ObservedPhases) != 0 { | |
| if err := verifyObservedPhases(cos.Status.ObservedPhases, currentPhases); err != nil { | |
| l.Error(err, "resolved phases content changed, blocking reconciliation") | |
| markAsNotProgressing(cos, ocv1.ClusterObjectSetReasonBlocked, err.Error()) | |
| return ctrl.Result{}, nil | |
| } |
| if err != nil { | ||
| return "", fmt.Errorf("marshaling object in phase %q: %w", phase.GetName(), err) | ||
| } | ||
| h.Write(data) |
There was a problem hiding this comment.
computePhaseHash writes the JSON of each object back-to-back with no delimiter/length prefix. While SHA-256 collisions are infeasible, concatenation without boundaries can still lead to ambiguous encodings (different object splits producing the same byte stream). Add an explicit separator (e.g., '\n' or a length prefix) between objects to make the input unambiguous.
| h.Write(data) | |
| h.Write(data) | |
| h.Write([]byte{'\n'}) |
| func verifyObservedPhases(stored, current []ocv1.ObservedPhase) error { | ||
| storedMap := make(map[string]string, len(stored)) | ||
| for _, s := range stored { | ||
| storedMap[s.Name] = s.Hash | ||
| } | ||
| for _, c := range current { | ||
| if prev, ok := storedMap[c.Name]; ok && prev != c.Hash { | ||
| return fmt.Errorf( | ||
| "resolved content of phase %q has changed (expected hash %s, got %s): "+ | ||
| "a referenced object source may have been deleted and recreated with different content", | ||
| c.Name, prev, c.Hash) | ||
| } | ||
| } | ||
| return nil |
There was a problem hiding this comment.
verifyObservedPhases only detects mismatches for phases that exist in current; it does not fail if a phase recorded in stored is missing from current. That allows a stored phase to “disappear” without being treated as a content change. Consider also iterating over stored to ensure every stored phase exists in current (and matches), and/or explicitly error when a stored phase is absent.
| @@ -354,8 +380,10 @@ func (c *ClusterObjectSetReconciler) SetupWithManager(mgr ctrl.Manager) error { | |||
| if !rev.DeletionTimestamp.IsZero() { | |||
| return true | |||
| } | |||
| if cnd := meta.FindStatusCondition(rev.Status.Conditions, ocv1.ClusterObjectSetTypeProgressing); cnd != nil && cnd.Status == metav1.ConditionFalse && cnd.Reason == ocv1.ReasonProgressDeadlineExceeded { | |||
| return false | |||
| if cnd := meta.FindStatusCondition(rev.Status.Conditions, ocv1.ClusterObjectSetTypeProgressing); cnd != nil && cnd.Status == metav1.ConditionFalse { | |||
| if cnd.Reason == ocv1.ReasonProgressDeadlineExceeded { | |||
| return false | |||
| } | |||
| } | |||
| return true | |||
| }, | |||
| @@ -366,7 +394,7 @@ func (c *ClusterObjectSetReconciler) SetupWithManager(mgr ctrl.Manager) error { | |||
| &ocv1.ClusterObjectSet{}, | |||
| builder.WithPredicates( | |||
| predicate.ResourceVersionChangedPredicate{}, | |||
| skipProgressDeadlineExceededPredicate, | |||
| skipTerminallyBlockedPredicate, | |||
| ), | |||
There was a problem hiding this comment.
The predicate variable is renamed to skipTerminallyBlockedPredicate, but the logic only skips reconciles for Progressing=False with reason ProgressDeadlineExceeded (it does not skip Reason=Blocked). This name is misleading and makes the filtering behavior harder to understand/maintain. Either rename it to reflect what it actually does, or extend the predicate to also cover the intended terminally-blocked reasons (if that’s the goal).
| // observedPhases records the content hashes of resolved phases | ||
| // at first successful reconciliation. This is used to detect if | ||
| // referenced object sources were deleted and recreated with | ||
| // different content. Each entry covers all fully-resolved object | ||
| // manifests within a phase, making it source-agnostic. | ||
| // | ||
| // +listType=map | ||
| // +listMapKey=name | ||
| // +optional | ||
| ObservedPhases []ObservedPhase `json:"observedPhases,omitempty"` |
There was a problem hiding this comment.
PR description mentions recording hashes in .status.observedObjectContainers, but the API/CRD changes add .status.observedPhases. Please align the PR description (and any related docs) with the actual field name to avoid confusing API consumers.
Addresses PR review feedback: - Compute SHA-256 hash per resolved phase (name + objects) instead of per-Secret, making immutability verification source-agnostic and futureproof for bundle refs. - Move hash computation after buildBoxcutterPhases succeeds so hashes are only persisted for successfully resolved content. - Remove Blocked reason from skipTerminallyBlockedPredicate so blocked COS can be re-reconciled when the underlying issue is fixed. API: rename ObservedObjectContainer → ObservedPhase, field observedObjectContainers → observedPhases in ClusterObjectSetStatus. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Description
ClusterObjectSet phases are immutable by design, but when objects are stored in
external Secrets via refs, the Secret content could be changed by deleting and
recreating the Secret with the same name. This enforces phase immutability for
the referenced object approach by:
immutable: truesetand recording it in
.status.observedPhasesProgressing=False, Reason=Blocked) if any referencedSecret is mutable or any phase's resolved content digest has changed
The digest is source-agnostic — it covers the fully resolved phase content
regardless of whether objects are inline or referenced from Secrets, making it
forward-compatible with future object sources (e.g., bundle refs).
Blocked COS resources are recoverable: they can be re-reconciled when triggered
(e.g., via annotation), re-evaluating the condition.
Reviewer Checklist