Skip to content

🌱 Ensure COS phase immutability for referenced object approach#2635

Open
pedjak wants to merge 2 commits intooperator-framework:mainfrom
pedjak:secret-verify
Open

🌱 Ensure COS phase immutability for referenced object approach#2635
pedjak wants to merge 2 commits intooperator-framework:mainfrom
pedjak:secret-verify

Conversation

@pedjak
Copy link
Copy Markdown
Contributor

@pedjak pedjak commented Apr 8, 2026

Description

ClusterObjectSet phases are immutable by design, but when objects are stored in
external Secrets via refs, the Secret content could be changed by deleting and
recreating the Secret with the same name. This enforces phase immutability for
the referenced object approach by:

  • Verifying that referenced Secrets have immutable: true set
  • Computing a per-phase SHA-256 content digest after successful phase resolution
    and recording it in .status.observedPhases
  • Blocking reconciliation (Progressing=False, Reason=Blocked) if any referenced
    Secret is mutable or any phase's resolved content digest has changed

The digest is source-agnostic — it covers the fully resolved phase content
regardless of whether objects are inline or referenced from Secrets, making it
forward-compatible with future object sources (e.g., bundle refs).

Blocked COS resources are recoverable: they can be re-reconciled when triggered
(e.g., via annotation), re-evaluating the condition.

Reviewer Checklist

  • API Go Documentation
  • Tests: Unit Tests (and E2E Tests, if appropriate)
  • Comprehensive Commit Messages
  • Links to related GitHub Issue(s)

Copilot AI review requested due to automatic review settings April 8, 2026 15:28
@netlify
Copy link
Copy Markdown

netlify bot commented Apr 8, 2026

Deploy Preview for olmv1 ready!

Name Link
🔨 Latest commit b95d9ab
🔍 Latest deploy log https://app.netlify.com/projects/olmv1/deploys/69d7c12c43c6950008002ba2
😎 Deploy Preview https://deploy-preview-2635--olmv1.netlify.app
📱 Preview on mobile
Toggle QR Code...

QR Code

Use your smartphone camera to open QR code link.

To edit notification comments on pull requests, go to your Netlify project configuration.

@openshift-ci
Copy link
Copy Markdown

openshift-ci bot commented Apr 8, 2026

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign kevinrizza for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci bot requested review from joelanford and oceanc80 April 8, 2026 15:28
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Note

Copilot was unable to run its full agentic suite in this review.

Enforces ClusterObjectSet phase immutability when objects are sourced via referenced Secrets by requiring referenced Secrets to be immutable and by detecting Secret content changes using recorded hashes.

Changes:

  • Add Secret verification in the COS reconciler (immutable requirement + SHA-256 hash comparison) and block reconciliation when verification fails.
  • Persist referenced Secret hashes in .status.observedObjectContainers and extend CRD/applyconfigurations accordingly.
  • Add/extend unit + e2e coverage and update docs to reflect the new behavior.

Reviewed changes

Copilot reviewed 14 out of 14 changed files in this pull request and generated 3 comments.

Show a summary per file
File Description
test/e2e/steps/steps.go Adds new e2e steps for triggering reconciliation, message fragment matching, and checking observed Secret hashes in status.
test/e2e/features/revision.feature Adds e2e scenarios for “mutable Secret” and “recreated Secret with changed content” blocking behavior.
manifests/experimental.yaml Extends CRD schema with .status.observedObjectContainers.
manifests/experimental-e2e.yaml Extends e2e CRD schema with .status.observedObjectContainers.
internal/operator-controller/controllers/resolve_ref_test.go Updates ref-resolution tests to use immutable Secrets.
internal/operator-controller/controllers/clusterobjectset_controller_internal_test.go Adds unit tests for Secret hashing and referenced-Secret verification.
internal/operator-controller/controllers/clusterobjectset_controller.go Implements referenced Secret verification + hashing and blocks reconciliation on violations.
helm/olmv1/base/operator-controller/crd/experimental/olm.operatorframework.io_clusterobjectsets.yaml Mirrors CRD schema changes for Helm packaging.
docs/draft/concepts/large-bundle-support.md Updates documented behavior/conventions for referenced Secrets (immutability + hash enforcement).
applyconfigurations/utils.go Registers apply-configuration kind for ObservedObjectContainer.
applyconfigurations/api/v1/observedobjectcontainer.go Adds generated apply configuration for the new status type.
applyconfigurations/api/v1/clusterobjectsetstatus.go Adds apply support for .status.observedObjectContainers.
api/v1/zz_generated.deepcopy.go Adds deepcopy support for ObservedObjectContainer and status field.
api/v1/clusterobjectset_types.go Introduces ObservedObjectContainer API type and status field.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +605 to +614
func ClusterObjectSetReportsConditionWithMessageFragment(ctx context.Context, revisionName, conditionType, conditionStatus, conditionReason string, msgFragment *godog.DocString) error {
msgCmp := alwaysMatch
if msgFragment != nil {
expectedMsgFragment := substituteScenarioVars(strings.Join(strings.Fields(msgFragment.Content), " "), scenarioCtx(ctx))
msgCmp = func(actualMsg string) bool {
return strings.Contains(actualMsg, expectedMsgFragment)
}
}
return waitForCondition(ctx, "clusterobjectset", substituteScenarioVars(revisionName, scenarioCtx(ctx)), conditionType, conditionStatus, &conditionReason, msgCmp)
}
Copy link

Copilot AI Apr 8, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This step normalizes whitespace in the expected fragment but not in the actual condition message; if the controller formats messages with newlines/multiple spaces (common when wrapping), strings.Contains may fail unexpectedly. Consider normalizing actualMsg with the same strings.Fields + Join approach before Contains to reduce e2e flakiness while still matching by fragment.

Copilot uses AI. Check for mistakes.
@pedjak pedjak changed the title ✨ Ensure COS phase immutability for referenced object approach 🌱 Ensure COS phase immutability for referenced object approach Apr 8, 2026
Copilot AI review requested due to automatic review settings April 8, 2026 15:46
@pedjak pedjak requested review from camilamacedo86 and dtfranz and removed request for oceanc80 April 8, 2026 15:54
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 14 out of 14 changed files in this pull request and generated 2 comments.

Comments suppressed due to low confidence (1)

internal/operator-controller/controllers/clusterobjectset_controller.go:1

  • Treating Progressing=False, Reason=Blocked as terminal at the predicate level can prevent the controller from ever reconciling again, even if the user fixes the root cause (e.g., recreates the Secret as immutable, or changes COS spec/generation). If “Blocked” is intended to be recoverable, remove it from this predicate or gate suppression more narrowly (e.g., only suppress when generation hasn’t changed, while still allowing spec updates / explicit triggers to enqueue reconciles).
//go:build !standard

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

@codecov
Copy link
Copy Markdown

codecov bot commented Apr 8, 2026

Codecov Report

❌ Patch coverage is 68.08511% with 30 lines in your changes missing coverage. Please review.
✅ Project coverage is 68.93%. Comparing base (8bd971b) to head (b95d9ab).
⚠️ Report is 2 commits behind head on main.

Files with missing lines Patch % Lines
api/v1/zz_generated.deepcopy.go 33.33% 8 Missing ⚠️
applyconfigurations/api/v1/observedphase.go 0.00% 8 Missing ⚠️
...plyconfigurations/api/v1/clusterobjectsetstatus.go 0.00% 6 Missing ⚠️
...troller/controllers/clusterobjectset_controller.go 90.90% 4 Missing and 2 partials ⚠️
applyconfigurations/utils.go 0.00% 2 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #2635      +/-   ##
==========================================
- Coverage   68.95%   68.93%   -0.03%     
==========================================
  Files         139      140       +1     
  Lines        9891     9981      +90     
==========================================
+ Hits         6820     6880      +60     
- Misses       2562     2590      +28     
- Partials      509      511       +2     
Flag Coverage Δ
e2e 37.05% <0.00%> (-0.55%) ⬇️
experimental-e2e 52.38% <66.30%> (+0.15%) ⬆️
unit 53.59% <50.00%> (-0.02%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

// +listType=map
// +listMapKey=name
// +optional
ObservedObjectContainers []ObservedObjectContainer `json:"observedObjectContainers,omitempty"`
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of requiring the individual objects to never change (and keeping track of per-object hashes), WDYT about computing a single hash of the resulting phases content. If that remains stable, do we care if the underlying organization of the secrets changed?

If we do that, it also means we abstract away any details of how we ultimately end up with those phases (i.e. we don't care if it is inline vs from secrets vs from a bundle ref, etc.), which would be more futureproof.

Copy link
Copy Markdown
Contributor Author

@pedjak pedjak Apr 9, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we do that, it also means we abstract away any details of how we ultimately end up with those phases (i.e. we don't care if it is inline vs from secrets vs from a bundle ref, etc.), which would be more futureproof.

interesting idea! implemented, ptal.


func (c *ClusterObjectSetReconciler) SetupWithManager(mgr ctrl.Manager) error {
skipProgressDeadlineExceededPredicate := predicate.Funcs{
skipTerminallyBlockedPredicate := predicate.Funcs{
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A terminally blocked COS could become unblocked if the user puts back the expected content though, right? We should continue reconciling to check if the expected content is back in place.

ClusterObjectSet phases are immutable by design, but when objects are
stored in external Secrets via refs, the Secret content could be changed
by deleting and recreating the Secret with the same name. This enforces
phase immutability for the referenced object approach by verifying that
referenced Secrets are marked immutable and that their content hashes
have not changed since first resolution.

On first successful reconciliation, SHA-256 hashes of referenced Secret
data are recorded in .status.observedObjectContainers. On subsequent
reconciles, the hashes are compared and reconciliation is blocked with
Progressing=False/Reason=Blocked if any mismatch is detected.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@pedjak pedjak requested a review from joelanford April 9, 2026 07:56
// object content at first successful reconciliation.
//
// +required
Hash string `json:"hash"`
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

By creating the resolution digest we still needing this one: https://github.com/operator-framework/operator-controller/pull/2610/changes#diff-c804d15cab34d4c7c30fcd68d40d0269084c11db0736d844bd6ee8f60262a924R68

If so, could we change the name PhaseDigest?

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 15 out of 15 changed files in this pull request and generated 7 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +161 to +169
currentPhases := make([]ocv1.ObservedPhase, 0, len(phases))
for _, phase := range phases {
hash, err := computePhaseHash(phase)
if err != nil {
setRetryingConditions(cos, err.Error())
return ctrl.Result{}, fmt.Errorf("computing phase hash: %v", err)
}
currentPhases = append(currentPhases, ocv1.ObservedPhase{Name: phase.GetName(), Hash: hash})
}
Copy link

Copilot AI Apr 9, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The phase hash is computed from the boxcutter phases returned by buildBoxcutterPhases(), after the reconciler mutates each resolved object (e.g., injecting labels.OwnerNameKey). This couples the stored .status.observedPhases hashes to controller-internal mutations and COS metadata, so future controller changes (or label changes) could falsely appear as “referenced content changed” and permanently block reconciliation. Consider hashing the pre-mutation resolved manifests (or normalizing/stripping controller-added fields) so the hash reflects only the referenced object content you’re trying to make immutable.

Copilot uses AI. Check for mistakes.
Comment on lines +171 to +176
if len(cos.Status.ObservedPhases) == 0 {
cos.Status.ObservedPhases = currentPhases
} else if err := verifyObservedPhases(cos.Status.ObservedPhases, currentPhases); err != nil {
l.Error(err, "resolved phases content changed, blocking reconciliation")
markAsNotProgressing(cos, ocv1.ClusterObjectSetReasonBlocked, err.Error())
return ctrl.Result{}, nil
Copy link

Copilot AI Apr 9, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ObservedPhases is currently persisted as soon as buildBoxcutterPhases succeeds (i.e., refs resolve), even if the subsequent revision reconcile later fails or is still rolling out. This seems to conflict with the field/docs wording (“first successful reconciliation”) and can lock in hashes before a COS has actually succeeded. Either update the wording to reflect “first successful resolution/build” or only set ObservedPhases once the reconcile has reached a success terminal state.

Suggested change
if len(cos.Status.ObservedPhases) == 0 {
cos.Status.ObservedPhases = currentPhases
} else if err := verifyObservedPhases(cos.Status.ObservedPhases, currentPhases); err != nil {
l.Error(err, "resolved phases content changed, blocking reconciliation")
markAsNotProgressing(cos, ocv1.ClusterObjectSetReasonBlocked, err.Error())
return ctrl.Result{}, nil
if len(cos.Status.ObservedPhases) != 0 {
if err := verifyObservedPhases(cos.Status.ObservedPhases, currentPhases); err != nil {
l.Error(err, "resolved phases content changed, blocking reconciliation")
markAsNotProgressing(cos, ocv1.ClusterObjectSetReasonBlocked, err.Error())
return ctrl.Result{}, nil
}

Copilot uses AI. Check for mistakes.
if err != nil {
return "", fmt.Errorf("marshaling object in phase %q: %w", phase.GetName(), err)
}
h.Write(data)
Copy link

Copilot AI Apr 9, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

computePhaseHash writes the JSON of each object back-to-back with no delimiter/length prefix. While SHA-256 collisions are infeasible, concatenation without boundaries can still lead to ambiguous encodings (different object splits producing the same byte stream). Add an explicit separator (e.g., '\n' or a length prefix) between objects to make the input unambiguous.

Suggested change
h.Write(data)
h.Write(data)
h.Write([]byte{'\n'})

Copilot uses AI. Check for mistakes.
Comment on lines +789 to +802
func verifyObservedPhases(stored, current []ocv1.ObservedPhase) error {
storedMap := make(map[string]string, len(stored))
for _, s := range stored {
storedMap[s.Name] = s.Hash
}
for _, c := range current {
if prev, ok := storedMap[c.Name]; ok && prev != c.Hash {
return fmt.Errorf(
"resolved content of phase %q has changed (expected hash %s, got %s): "+
"a referenced object source may have been deleted and recreated with different content",
c.Name, prev, c.Hash)
}
}
return nil
Copy link

Copilot AI Apr 9, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

verifyObservedPhases only detects mismatches for phases that exist in current; it does not fail if a phase recorded in stored is missing from current. That allows a stored phase to “disappear” without being treated as a content change. Consider also iterating over stored to ensure every stored phase exists in current (and matches), and/or explicitly error when a stored phase is absent.

Copilot uses AI. Check for mistakes.
Comment on lines 372 to 398
@@ -354,8 +380,10 @@ func (c *ClusterObjectSetReconciler) SetupWithManager(mgr ctrl.Manager) error {
if !rev.DeletionTimestamp.IsZero() {
return true
}
if cnd := meta.FindStatusCondition(rev.Status.Conditions, ocv1.ClusterObjectSetTypeProgressing); cnd != nil && cnd.Status == metav1.ConditionFalse && cnd.Reason == ocv1.ReasonProgressDeadlineExceeded {
return false
if cnd := meta.FindStatusCondition(rev.Status.Conditions, ocv1.ClusterObjectSetTypeProgressing); cnd != nil && cnd.Status == metav1.ConditionFalse {
if cnd.Reason == ocv1.ReasonProgressDeadlineExceeded {
return false
}
}
return true
},
@@ -366,7 +394,7 @@ func (c *ClusterObjectSetReconciler) SetupWithManager(mgr ctrl.Manager) error {
&ocv1.ClusterObjectSet{},
builder.WithPredicates(
predicate.ResourceVersionChangedPredicate{},
skipProgressDeadlineExceededPredicate,
skipTerminallyBlockedPredicate,
),
Copy link

Copilot AI Apr 9, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The predicate variable is renamed to skipTerminallyBlockedPredicate, but the logic only skips reconciles for Progressing=False with reason ProgressDeadlineExceeded (it does not skip Reason=Blocked). This name is misleading and makes the filtering behavior harder to understand/maintain. Either rename it to reflect what it actually does, or extend the predicate to also cover the intended terminally-blocked reasons (if that’s the goal).

Copilot uses AI. Check for mistakes.
Comment on lines +514 to +523
// observedPhases records the content hashes of resolved phases
// at first successful reconciliation. This is used to detect if
// referenced object sources were deleted and recreated with
// different content. Each entry covers all fully-resolved object
// manifests within a phase, making it source-agnostic.
//
// +listType=map
// +listMapKey=name
// +optional
ObservedPhases []ObservedPhase `json:"observedPhases,omitempty"`
Copy link

Copilot AI Apr 9, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PR description mentions recording hashes in .status.observedObjectContainers, but the API/CRD changes add .status.observedPhases. Please align the PR description (and any related docs) with the actual field name to avoid confusing API consumers.

Copilot uses AI. Check for mistakes.
Addresses PR review feedback:
- Compute SHA-256 hash per resolved phase (name + objects) instead of
  per-Secret, making immutability verification source-agnostic and
  futureproof for bundle refs.
- Move hash computation after buildBoxcutterPhases succeeds so hashes
  are only persisted for successfully resolved content.
- Remove Blocked reason from skipTerminallyBlockedPredicate so blocked
  COS can be re-reconciled when the underlying issue is fixed.

API: rename ObservedObjectContainer → ObservedPhase, field
observedObjectContainers → observedPhases in ClusterObjectSetStatus.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants