
OADP-7869: Fix stale resourceAnnotations/resourceLabels after DPA patch#2176

Open
shubham-pampattiwar wants to merge 1 commit into openshift:oadp-dev from shubham-pampattiwar:fix/oadp-7869-stale-resource-annotations

Conversation


@shubham-pampattiwar shubham-pampattiwar commented Apr 22, 2026

Summary

  • Fix stale resourceAnnotations and resourceLabels not being removed from managed resources when cleared from the DPA spec
  • Introduce tracking annotations (oadp.openshift.io/managed-resource-annotation-keys, oadp.openshift.io/managed-resource-label-keys) to record which keys were applied, enabling cleanup on subsequent reconciles
  • Fix Velero deployment incorrectly merging metadata annotations into the pod template via AppendUniqueKeyTOfTMaps, aligning with the node-agent pattern of passing only PodConfig annotations to the install library

Details

Problem

When resourceAnnotations or resourceLabels were removed from the DPA spec, stale values persisted on all managed resources (Velero deployment, node-agent DaemonSet, BSL, ConfigMaps, Services, etc.). The applyResourceAnnotations() and applyResourceLabels() functions only merged values in and never removed previously-applied keys.

Additionally, the node-agent DaemonSet appeared unaffected at the pod level because its spec was rebuilt from scratch each reconcile, but its object metadata annotations were also stale. The Velero deployment pod template was affected because metadata annotations leaked into the pod template through an incorrect merge.

Fix

  • Tracking annotations: Each managed resource now carries a tracking annotation that records which annotation/label keys were applied via resourceAnnotations/resourceLabels. On each reconcile, previously-tracked keys are removed before applying the current set. This follows the same pattern used by kubectl apply (kubectl.kubernetes.io/last-applied-configuration).
  • Pod template annotation leak: Removed the merge of deployment metadata annotations into pod template annotations for the Velero deployment, matching the existing node-agent pattern.
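The tracking-annotation mechanism described above can be sketched roughly as follows. This is a simplified illustration, not the PR's actual code: the function name `applyTrackedAnnotations` is invented here, and the real implementation in `internal/controller/velero.go` handles labels, reserved keys, and nil maps with more care.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// Tracking-annotation key; the PR uses
// oadp.openshift.io/managed-resource-annotation-keys.
const managedKeysAnnotation = "oadp.openshift.io/managed-resource-annotation-keys"

// applyTrackedAnnotations removes keys applied on a previous reconcile,
// then applies the currently desired set and records its keys, so that
// clearing resourceAnnotations in the DPA spec cleans up stale entries.
func applyTrackedAnnotations(current, desired map[string]string) map[string]string {
	result := map[string]string{}
	for k, v := range current {
		result[k] = v
	}
	// Step 1: drop every key recorded by the previous reconcile.
	if tracked, ok := result[managedKeysAnnotation]; ok {
		for _, k := range strings.Split(tracked, ",") {
			delete(result, strings.TrimSpace(k))
		}
		delete(result, managedKeysAnnotation)
	}
	// Step 2: apply the desired set and record its keys, sorted so the
	// tracking value is deterministic across reconciles.
	var keys []string
	for k, v := range desired {
		result[k] = v
		keys = append(keys, k)
	}
	if len(keys) > 0 {
		sort.Strings(keys)
		result[managedKeysAnnotation] = strings.Join(keys, ",")
	}
	return result
}

func main() {
	// First reconcile: user sets an annotation via the DPA spec.
	r1 := applyTrackedAnnotations(map[string]string{"team": "infra"}, map[string]string{"env": "prod"})
	fmt.Println(r1["env"], r1[managedKeysAnnotation]) // prod env
	// Second reconcile: the spec was cleared; the stale key is removed,
	// while annotations the operator never tracked are left alone.
	r2 := applyTrackedAnnotations(r1, nil)
	_, stale := r2["env"]
	fmt.Println(stale, r2["team"]) // false infra
}
```

Because the tracking annotation travels with the resource itself, no extra state store is needed; this mirrors how `kubectl apply` uses `kubectl.kubernetes.io/last-applied-configuration` to compute deletions.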

Affected resources

All resources using applyResourceAnnotations/applyResourceLabels (12 call sites across 10 files): Velero Deployment, Node-Agent DaemonSet, Node-Agent ConfigMap, BSL, VSL, Monitor Service, Registry Secret, Backup Repository ConfigMap, Repository Maintenance ConfigMap, NonAdmin Controller, KubeVirt DataMover, VMFileRestore Controller.

Fixes: OADP-7869

Test plan

  • Unit tests: 27 test cases covering annotations, labels, and filterOutResourceLabels (nil/empty inputs, stale key cleanup, partial removal, sorted tracking keys, protected label filtering, reserved key protection, idempotency)
  • make test passes (unit tests + lint + go vet + api-isupdated)
  • Live cluster verification: deployed fix, added resourceAnnotations/resourceLabels to DPA, verified they appeared on all managed resources, removed them, verified all stale values were cleaned up across Deployment metadata, DaemonSet metadata, pod templates, BSL, and ConfigMaps
  • CI/Prow E2E tests

🤖 Generated with Claude Code

Summary by CodeRabbit

  • Refactor

    • Consolidated resource label and annotation application logic to operate as single combined steps across multiple resource types and hierarchy levels.
    • Enhanced lifecycle management with tracking of user-provided label and annotation keys for cleanup of stale entries.
    • Streamlined annotation processing by removing redundant separate operations.
  • Tests

    • Extended test coverage for combined label and annotation handling scenarios.
    • Added validation for tracking and removal of stale label and annotation keys.

…fter DPA patch

When resourceAnnotations or resourceLabels were removed from the DPA spec,
stale values persisted on all managed resources (Velero deployment, BSL,
ConfigMaps, Services, etc.) because the apply functions only merged annotations
in and never removed previously-applied keys.

This fix introduces tracking annotations (oadp.openshift.io/managed-resource-
annotation-keys and oadp.openshift.io/managed-resource-label-keys) that record
which keys were applied by resourceAnnotations/resourceLabels. On each reconcile,
previously-tracked keys are removed before applying the current set, ensuring
cleanup when the DPA spec changes.

Additionally fixes a secondary issue where the Velero deployment incorrectly
merged metadata annotations into the pod template via AppendUniqueKeyTOfTMaps,
causing stale annotations to persist on the pod template. Aligned with the
node-agent pattern of passing only PodConfig annotations to the install library.

Fixes: OADP-7869

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
@openshift-ci-robot added the jira/valid-reference label (indicates that this PR references a valid Jira ticket of any type) Apr 22, 2026
@openshift-ci-robot

openshift-ci-robot commented Apr 22, 2026

@shubham-pampattiwar: This pull request references OADP-7869 which is a valid jira issue.


Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.


coderabbitai Bot commented Apr 22, 2026

Walkthrough

This PR consolidates how user-provided resource labels and annotations are applied across multiple controller files. The applyResourceLabels function is refactored to accept and return both labels and annotations instead of only labels. Additionally, managed-key tracking via annotations is introduced to track which user-provided label and annotation keys are currently managed, with stale keys being cleaned up during reconciliation.

Changes

  • Standard Label/Annotation Consolidation
    Files: internal/controller/backup_repository.go, internal/controller/bsl.go, internal/controller/monitor.go, internal/controller/registry.go, internal/controller/repository_maintenance.go, internal/controller/vsl.go
    Summary: Updated to call applyResourceLabels with both existing labels and annotations, assigning results back to both fields. Subsequent applyResourceAnnotations calls remain for further processing.
  • Multi-Level Label/Annotation Consolidation
    Files: internal/controller/kubevirt_datamover_controller.go, internal/controller/nodeagent.go, internal/controller/nonadmin_controller.go, internal/controller/vmfilerestore_controller.go
    Summary: Applied label/annotation consolidation at multiple levels (Deployment/DaemonSet metadata and pod template metadata). Related comments updated to reflect combined handling.
  • Core Label/Annotation Tracking Logic
    Files: internal/controller/velero.go
    Summary: Introduces managed-key tracking via annotations (managedLabelKeysAnnotation, managedAnnotationKeysAnnotation). Updates applyResourceLabels and applyResourceAnnotations to manage the lifecycle of user-provided keys: deletes stale tracked keys and updates tracking annotations with sorted, comma-separated lists of currently applied keys. Removes merging logic in pod annotation initialization.
  • Test Coverage Updates
    Files: internal/controller/velero_test.go
    Summary: Extends TestApplyResourceLabels with annotation-tracking validation and stale-key removal scenarios. Adds TestFilterOutResourceLabels for dedicated filtering verification. Updates TestApplyResourceAnnotations to assert tracking-key management and idempotency.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

🚥 Pre-merge checks | ✅ 10 | ❌ 2

❌ Failed checks (2 warnings)

  • Docstring Coverage (⚠️ Warning): Docstring coverage is 57.14%, below the required threshold of 80.00%. Resolution: write docstrings for the functions missing them.
  • Description check (⚠️ Warning): The PR description provides comprehensive detail on the problem, solution, affected resources, and test coverage, but lacks explicit testing instructions as required by the template. Resolution: add a "How to test the changes made" section with specific commands and verification steps.
✅ Passed checks (10 passed)
  • Title check: The title clearly and specifically summarizes the main change: fixing stale resourceAnnotations/resourceLabels after DPA patch, matching the core problem addressed across all modified files.
  • Linked Issues check: Skipped because no linked issues were found for this pull request.
  • Out of Scope Changes check: Skipped because no linked issues were found for this pull request.
  • Stable And Deterministic Test Names: All test names use descriptive static strings with no dynamic values, timestamps, IDs, or pod/node/namespace suffixes.
  • Test Structure And Quality: Tests use a table-driven pattern with descriptive assertion messages identifying specific validation failures.
  • Microshift Test Compatibility: The PR adds unit tests to internal/controller/velero_test.go using standard Go testing patterns, not Ginkgo e2e tests.
  • Single Node Openshift (SNO) Test Compatibility: Only unit tests in internal/controller/velero_test.go are modified, not Ginkgo e2e tests in tests/e2e/; no cluster-level e2e tests were added.
  • Topology-Aware Scheduling Compatibility: The PR introduces no scheduling constraints or topology-related assumptions; all changes are confined to label and annotation lifecycle management.
  • OTE Binary Stdout Contract: No violation; the modified files are all controller package methods with no new process-level entry points. The existing fmt.Printf debug statement is within a controller reconciliation method and guarded by a debugMode flag. The test suite configures logging to GinkgoWriter instead of stdout.
  • IPv6 And Disconnected Network Test Compatibility: This PR contains only unit tests in internal/controller using Go's standard testing pattern, with no Ginkgo e2e tests, network calls, or IPv6 assumptions.


@openshift-ci openshift-ci Bot requested review from mpryc and mrnold April 22, 2026 19:52
@openshift-ci

openshift-ci Bot commented Apr 22, 2026

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: shubham-pampattiwar

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details: Needs approval from an approver in each of these files:
  • OWNERS [shubham-pampattiwar]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci Bot added the approved label (indicates a PR has been approved by an approver from all required OWNERS files) Apr 22, 2026
@shubham-pampattiwar shubham-pampattiwar self-assigned this Apr 22, 2026
@shubham-pampattiwar
Member Author

/cherry-pick oadp-1.6

@openshift-cherrypick-robot

@shubham-pampattiwar: once the present PR merges, I will cherry-pick it on top of oadp-1.6 in a new PR and assign it to you.


Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.


@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
internal/controller/velero.go (1)

263-270: ⚠️ Potential issue | 🔴 Critical

Re-assert Velero selector labels after applying resourceLabels.

applyResourceLabels can overwrite k8s-app, component, and deploy on the pod template, but Spec.Selector.MatchLabels stays unchanged. A DPA that sets any of those keys in spec.resourceLabels will make the Deployment selector stop matching its pods.

Proposed fix
  // Apply user-provided resource labels (protected labels are filtered)
  // Note: NOT applied to Spec.Selector.MatchLabels as those are immutable after creation
  veleroDeployment.Labels, veleroDeployment.Annotations = applyResourceLabels(dpa, veleroDeployment.Labels, veleroDeployment.Annotations)
  veleroDeployment.Spec.Template.Labels, veleroDeployment.Spec.Template.Annotations = applyResourceLabels(dpa, veleroDeployment.Spec.Template.Labels, veleroDeployment.Spec.Template.Annotations)
+
+ // Re-assert selector labels so user resourceLabels cannot break the Deployment selector.
+ if veleroDeployment.Spec.Selector != nil && veleroDeployment.Spec.Selector.MatchLabels != nil {
+ 	if veleroDeployment.Spec.Template.Labels == nil {
+ 		veleroDeployment.Spec.Template.Labels = make(map[string]string)
+ 	}
+ 	for k, v := range veleroDeployment.Spec.Selector.MatchLabels {
+ 		veleroDeployment.Spec.Template.Labels[k] = v
+ 	}
+ }
 
  // Apply user-provided resource annotations to both deployment and pod template
  veleroDeployment.Annotations = applyResourceAnnotations(dpa, veleroDeployment.Annotations)
  veleroDeployment.Spec.Template.Annotations = applyResourceAnnotations(dpa, veleroDeployment.Spec.Template.Annotations)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/controller/velero.go` around lines 263 - 270, After applying
applyResourceLabels to veleroDeployment.Spec.Template.Labels, re-assert the
Deployment selector so it continues to match pods: copy the selector keys
"k8s-app", "component", and "deploy" from veleroDeployment.Spec.Template.Labels
into veleroDeployment.Spec.Selector.MatchLabels (upsert those keys only) so the
selector remains consistent with the pod template; perform this update
immediately after the applyResourceLabels calls on
veleroDeployment.Spec.Template.Labels in the veleroDeployment handling code.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml

Review profile: CHILL

Plan: Pro Plus

Run ID: 0569ca88-e1e2-42fc-b8c6-9520dfb1b687

📥 Commits

Reviewing files that changed from the base of the PR and between c927a46 and eff720f.

📒 Files selected for processing (12)
  • internal/controller/backup_repository.go
  • internal/controller/bsl.go
  • internal/controller/kubevirt_datamover_controller.go
  • internal/controller/monitor.go
  • internal/controller/nodeagent.go
  • internal/controller/nonadmin_controller.go
  • internal/controller/registry.go
  • internal/controller/repository_maintenance.go
  • internal/controller/velero.go
  • internal/controller/velero_test.go
  • internal/controller/vmfilerestore_controller.go
  • internal/controller/vsl.go

Comment on lines +938 to 963
if tracked, ok := result[managedAnnotationKeysAnnotation]; ok {
	for _, k := range strings.Split(tracked, ",") {
		if k = strings.TrimSpace(k); k != "" {
			delete(result, k)
		}
	}
	delete(result, managedAnnotationKeysAnnotation)
}

if !hasNewAnnotations {
	return result
}

var keys []string
for k, v := range dpa.Spec.ResourceAnnotations {
	if k == managedAnnotationKeysAnnotation || k == managedLabelKeysAnnotation {
		continue
	}
	result[k] = v
	keys = append(keys, k)
}

if len(keys) > 0 {
	slices.Sort(keys)
	result[managedAnnotationKeysAnnotation] = strings.Join(keys, ",")
}

@coderabbitai coderabbitai Bot Apr 22, 2026


⚠️ Potential issue | 🟠 Major

Don't delete controller-owned annotations during stale-key cleanup.

This helper drops every key listed in managed-resource-annotation-keys before looking at the current annotation set. If the controller already rebuilt one of those annotations this reconcile, removing it from spec.resourceAnnotations now deletes the operator-owned value too. That breaks fallback to defaults for keys like Velero's Prometheus annotations or kdmDpaResourceVersionAnnotation.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/controller/velero.go` around lines 938 - 963, The stale-key cleanup
currently deletes every key listed in managedAnnotationKeysAnnotation from
result before merging current spec annotations, which removes
controller-recreated annotations; instead, merge dpa.Spec.ResourceAnnotations
into result first (the block that iterates dpa.Spec.ResourceAnnotations and sets
keys/result when hasNewAnnotations is true), then run the tracked-key cleanup:
iterate the tracked list from result[managedAnnotationKeysAnnotation] and only
delete a key from result if that key is NOT present in
dpa.Spec.ResourceAnnotations; finally recompute managedAnnotationKeysAnnotation
(using keys and slices.Sort) as you already do. Use the existing symbols
managedAnnotationKeysAnnotation, managedLabelKeysAnnotation, result, tracked,
dpa.Spec.ResourceAnnotations, hasNewAnnotations and keys to locate and
reorder/adjust the logic.

Member Author


This is a false positive. The tracking annotation only records keys that the user explicitly set via dpa.Spec.ResourceAnnotations. It never tracks operator-owned annotations like prometheus.io/scrape or deployment.kubernetes.io/revision -- those keys would only appear in the tracking list if the user explicitly added them to resourceAnnotations, in which case removing them on cleanup is the correct behavior.

The concern about Prometheus annotations or kdmDpaResourceVersionAnnotation being deleted doesn't apply because:

  • Prometheus annotations are set on the pod template, not on deployment metadata where this function operates
  • Operator-internal annotations are never added to the tracking list unless a user explicitly puts them in resourceAnnotations

Both orderings (delete-then-merge vs merge-then-delete-unmatched) produce identical results since step 3 re-adds any key still in the current spec regardless.


@shubham-pampattiwar, you're right — I've re-traced the logic and agree this is a false positive.

The invariant holds: the tracking annotation only ever records keys sourced from dpa.Spec.ResourceAnnotations, so:

  1. Operator-owned annotations (prometheus.io/*, kdmDpaResourceVersionAnnotation, etc.) are never added to the tracking list unless the user explicitly puts them in resourceAnnotations.
  2. In the delete-then-merge flow, any key that is still in the current spec is unconditionally re-added by the dpa.Spec.ResourceAnnotations loop, making both orderings equivalent.
  3. The only keys permanently removed are those that were tracked but are absent from the current spec — which is precisely the stale-cleanup intent.

My concern was unfounded. The implementation is correct as-is.

@openshift-ci

openshift-ci Bot commented Apr 22, 2026

@shubham-pampattiwar: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

  • ci/prow/4.23-e2e-test-aws (commit eff720f, required: false): rerun with /test 4.23-e2e-test-aws

Full PR test history. Your PR dashboard.



Labels

  • approved: Indicates a PR has been approved by an approver from all required OWNERS files.
  • jira/valid-reference: Indicates that this PR references a valid Jira ticket of any type.
