
Update Current Config & variation to have accelerator info#1935

Open
bharathappali wants to merge 11 commits into kruize:mvp_demo from bharathappali:gpu-cur-var-2

bharathappali commented Jun 4, 2026

Description

Continuation of #1921.

PR #1921 needs to be merged before this PR.

This PR updates the current config and adds a new mechanism for including accelerator info in both the current config and the variation.

Fixes #1920

Type of change

  • Bug fix
  • New feature
  • Docs update
  • Breaking change (What changes might users need to make in their application due to this PR?)
  • Requires DB changes

How has this been tested?

Please describe the tests that were run to verify your changes and steps to reproduce. Please specify any test configuration required.

  • New Test X
  • Functional testsuite

Test Configuration

  • Kubernetes clusters tested on:

Checklist 🎯

  • Followed coding guidelines
  • Comments added
  • Dependent changes merged
  • Documentation updated
  • Tests added or updated

Additional information

Include any additional information such as links, test results, screenshots here

Summary by Sourcery

Add support for representing accelerator resource information alongside CPU and memory in current and recommended configurations.

New Features:

  • Introduce generic ResourceRecommendation types, including structured accelerator recommendations with compute and memory details.
  • Capture accelerator metrics from interval results to derive current accelerator configuration for containers and namespaces.
  • Serialize multi-resource (accelerator) recommendations as JSON arrays via a custom adapter, and expose them through extended configuration structures.

Enhancements:

  • Refactor current configuration handling to use ResourceRecommendation instead of raw RecommendationConfigItem, preparing the model for accelerator-aware recommendations.
  • Extend recommendation items with a non-Kubernetes ACCELERATORS key for internal JSON representation of accelerator resources.

…ion and current

Signed-off-by: bharathappali <abharath@redhat.com>
…d fail fast

Signed-off-by: bharathappali <abharath@redhat.com>
bharathappali requested a review from khansaad June 4, 2026 08:36

sourcery-ai Bot commented Jun 4, 2026

Reviewer's Guide

Extends the recommendation engine’s current configuration model to support accelerator (GPU) metadata alongside CPU/memory, introducing polymorphic resource recommendations and utilities to derive accelerator info from metrics and serialize it cleanly in the API.

Sequence diagram for deriving current accelerator recommendation from metrics

sequenceDiagram
    participant RecommendationUtils
    participant filteredResultsMap
    participant IntervalResults
    participant MetricResults
    participant AcceleratorMetricMetadata
    participant MultiResourceRecommendation
    participant AcceleratorRecommendationItem

    RecommendationUtils->>filteredResultsMap: get(timestampToExtract)
    alt hasAcceleratorData
        RecommendationUtils->>RecommendationUtils: hasAcceleratorData(IntervalResults)
    else findLatestAcceleratorTimestamp
        RecommendationUtils->>RecommendationUtils: findLatestAcceleratorTimestamp(filteredResultsMap, timestampToExtract)
        RecommendationUtils->>filteredResultsMap: get(realTimestampToExtract)
    end

    RecommendationUtils->>IntervalResults: getMetricResultsMap()
    loop for metric in ACCELERATOR_METRICS
        RecommendationUtils->>IntervalResults: getMetricResultsMap().get(metric)
        IntervalResults-->>RecommendationUtils: MetricResults
        alt metricResult not null
            RecommendationUtils->>MetricResults: getMetadata()
            MetricResults-->>RecommendationUtils: AcceleratorMetricMetadata
            RecommendationUtils->>RecommendationUtils: getAcceleratorRecommendationItem(AcceleratorMetricMetadata)
            RecommendationUtils-->>AcceleratorRecommendationItem: AcceleratorRecommendationItem
        end
    end

    RecommendationUtils->>MultiResourceRecommendation: new MultiResourceRecommendation()
    RecommendationUtils->>MultiResourceRecommendation: addAcceleratorRecommendationItem(AcceleratorRecommendationItem)
    RecommendationUtils-->>MultiResourceRecommendation: return MultiResourceRecommendation
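
The fallback shown in the diagram can be sketched in plain Java as follows. This is a minimal illustration, not the PR's code: `AcceleratorLookupSketch`, the metric names, and the use of `Instant` keys with a "metric name to model name" map are all assumptions made to keep the example self-contained, and "latest" is assumed to mean the newest timestamp at or before the requested one.

```java
import java.time.Instant;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.Set;
import java.util.TreeMap;

// Simplified stand-ins for the PR's types; every name below is hypothetical.
final class AcceleratorLookupSketch {
    // Stand-in for ACCELERATOR_METRICS; these metric names are invented.
    static final Set<String> ACCELERATOR_METRICS =
            Set.of("acceleratorCoreUsage", "acceleratorMemoryUsage");

    // An interval is modelled as: metric name -> accelerator model name.
    static boolean hasAcceleratorData(Map<String, String> interval) {
        if (interval == null) {
            return false;
        }
        for (String metric : ACCELERATOR_METRICS) {
            if (interval.get(metric) != null) {
                return true;
            }
        }
        return false;
    }

    // Assumption: "latest" means the newest timestamp at or before upTo.
    static Optional<Instant> findLatestAcceleratorTimestamp(
            TreeMap<Instant, Map<String, String>> results, Instant upTo) {
        for (Map.Entry<Instant, Map<String, String>> entry
                : results.headMap(upTo, true).descendingMap().entrySet()) {
            if (hasAcceleratorData(entry.getValue())) {
                return Optional.of(entry.getKey());
            }
        }
        return Optional.empty();
    }

    // Returns the distinct accelerator model names for the chosen interval.
    static List<String> currentAcceleratorModels(
            TreeMap<Instant, Map<String, String>> results, Instant timestampToExtract) {
        Map<String, String> interval = results.get(timestampToExtract);
        if (!hasAcceleratorData(interval)) {
            // Requested interval has no accelerator data: fall back to an older one.
            Optional<Instant> fallback =
                    findLatestAcceleratorTimestamp(results, timestampToExtract);
            if (fallback.isEmpty()) {
                return List.of();
            }
            interval = results.get(fallback.get());
        }
        List<String> models = new ArrayList<>();
        for (String metric : ACCELERATOR_METRICS) {
            String model = interval.get(metric);
            if (model != null && !models.contains(model)) {
                models.add(model);
            }
        }
        return models;
    }
}
```

Returning an empty list when no interval carries accelerator data keeps the caller free of null checks; whether the PR returns null or an empty wrapper in that case is not stated in the guide.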

File-Level Changes

Change: Introduce accelerator-aware recommendation model and utilities to derive accelerator recommendations from metrics
Details:
  • Define ACCELERATOR_METRICS set and helper methods to detect accelerator data and find latest timestamp with such data
  • Add logic to build AcceleratorRecommendationItem and wrap it in MultiResourceRecommendation based on accelerator metric metadata, including MIG profile parsing and GPU model lookup
  • Add helper methods to compute GPU slices and memory GiB from full GPU model or MIG profile strings
Files:
  • src/main/java/com/autotune/analyzer/recommendations/utils/RecommendationUtils.java
  • src/main/java/com/autotune/analyzer/recommendations/AcceleratorRecommendationItem.java
  • src/main/java/com/autotune/analyzer/recommendations/MultiResourceRecommendation.java

Change: Generalize current configuration maps to support heterogeneous resource recommendations (CPU, memory, accelerators)
Details:
  • Change currentConfig map types from RecommendationConfigItem to ResourceRecommendation across processors and DTOs
  • Update extraction logic to cast CPU and memory entries from ResourceRecommendation back to RecommendationConfigItem
  • Introduce ExtendedConfig and CurrentContainerConfigValues to carry container-level config including future accelerator fields
Files:
  • src/main/java/com/autotune/analyzer/recommendations/engine/BaseRecommendationProcessor.java
  • src/main/java/com/autotune/analyzer/recommendations/engine/ContainerRecommendationProcessor.java
  • src/main/java/com/autotune/analyzer/recommendations/engine/NamespaceRecommendationProcessor.java
  • src/main/java/com/autotune/analyzer/recommendations/objects/MappedRecommendationForTimestamp.java
  • src/main/java/com/autotune/analyzer/recommendations/ExtendedConfig.java

Change: Add polymorphic resource recommendation abstraction and JSON (de)serialization for accelerator arrays
Details:
  • Introduce ResourceRecommendation marker interface and have RecommendationConfigItem and MultiResourceRecommendation implement it
  • Add a MultiResourceRecommendationAdapter to serialize MultiResourceRecommendation as a plain JSON array and deserialize arrays back into wrapper objects, returning null on empty lists
  • Extend RecommendationItem enum with ACCELERATORS to represent non-Kubernetes accelerator entries in current config
Files:
  • src/main/java/com/autotune/analyzer/recommendations/RecommendationConfigItem.java
  • src/main/java/com/autotune/analyzer/adapters/MultiResourceRecommendationAdapter.java
  • src/main/java/com/autotune/analyzer/utils/AnalyzerConstants.java
  • src/main/java/com/autotune/analyzer/recommendations/ResourceRecommendation.java
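
To make the "GPU slices and memory from MIG profile strings" item concrete, here is a rough sketch of parsing a MIG profile name such as `3g.20gb`. `MigProfileSketch` is a hypothetical helper, not the PR's method: it assumes the profile contains a `<slices>g.<memory>gb` core and returns the memory figure exactly as written in the profile name. The full-GPU model lookup mentioned above is not covered.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper; not the PR's actual parsing code.
final class MigProfileSketch {
    // Matches the "<slices>g.<memory>gb" core of a MIG profile name.
    private static final Pattern MIG_PROFILE =
            Pattern.compile("(\\d{1,4})g\\.(\\d{1,4})gb", Pattern.CASE_INSENSITIVE);

    final int computeSlices;
    final int memoryGb;

    private MigProfileSketch(int computeSlices, int memoryGb) {
        this.computeSlices = computeSlices;
        this.memoryGb = memoryGb;
    }

    // Returns empty when the string does not look like a MIG profile.
    static Optional<MigProfileSketch> parse(String profile) {
        if (profile == null) {
            return Optional.empty();
        }
        Matcher matcher = MIG_PROFILE.matcher(profile);
        if (!matcher.find()) {
            return Optional.empty();
        }
        return Optional.of(new MigProfileSketch(
                Integer.parseInt(matcher.group(1)),
                Integer.parseInt(matcher.group(2))));
    }
}
```

An empty `Optional` gives the caller an explicit signal to fall back to a full-GPU lookup instead of guessing slice counts.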

bharathappali requested a review from mbvreddy June 4, 2026 08:36
bharathappali self-assigned this Jun 4, 2026
bharathappali moved this to In Progress in Monitoring Jun 4, 2026
bharathappali added this to the Kruize 0.12.0 Release milestone Jun 4, 2026

sourcery-ai Bot left a comment

Hey - I've found 4 issues, and left some high level feedback:

  • The new extractCurrentContainerConfig method appears incomplete: currentAcceleratorRequest/currentAcceleratorLimit are unused and the ACCELERATORS branch in the limits map is empty, so either wire these through to CurrentContainerConfigValues or remove the dead code.
  • The change to use ResourceRecommendation in the currentConfigMap introduces multiple unchecked casts back to RecommendationConfigItem; consider a more type-safe approach (e.g., generics or separate accessors for CPU/memory vs multi-resource types) to avoid potential ClassCastExceptions and make the API clearer.
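
One possible shape for the "separate accessors" idea in the second point is sketched below. It is only an illustration under assumptions: `TypedCurrentConfigSketch`, `ScalarItem`, and `AcceleratorItem` are hypothetical stand-ins for the PR's `RecommendationConfigItem` and `AcceleratorRecommendationItem`, and the field layout is invented.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;

// Hypothetical holder; the nested types stand in for the PR's classes.
final class TypedCurrentConfigSketch {
    // Stand-in for RecommendationConfigItem.
    static final class ScalarItem {
        final double amount;
        final String format;

        ScalarItem(double amount, String format) {
            this.amount = amount;
            this.format = format;
        }
    }

    // Stand-in for AcceleratorRecommendationItem.
    static final class AcceleratorItem {
        final String modelName;

        AcceleratorItem(String modelName) {
            this.modelName = modelName;
        }
    }

    private ScalarItem cpu;
    private ScalarItem memory;
    private final List<AcceleratorItem> accelerators = new ArrayList<>();

    void setCpu(ScalarItem cpu) {
        this.cpu = cpu;
    }

    void setMemory(ScalarItem memory) {
        this.memory = memory;
    }

    void addAccelerator(AcceleratorItem item) {
        if (item != null) {
            accelerators.add(item);
        }
    }

    // Typed accessors: callers never cast, so no ClassCastException is possible.
    Optional<ScalarItem> cpu() {
        return Optional.ofNullable(cpu);
    }

    Optional<ScalarItem> memory() {
        return Optional.ofNullable(memory);
    }

    List<AcceleratorItem> accelerators() {
        return List.copyOf(accelerators);
    }
}
```

The trade-off is that a new resource kind needs a new accessor, whereas the map keyed by `RecommendationItem` accepts one without code changes.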
Prompt for AI Agents
Please address the comments from this code review:

## Overall Comments
- The new `extractCurrentContainerConfig` method appears incomplete: `currentAcceleratorRequest`/`currentAcceleratorLimit` are unused and the `ACCELERATORS` branch in the limits map is empty, so either wire these through to `CurrentContainerConfigValues` or remove the dead code.
- The change to use `ResourceRecommendation` in the `currentConfigMap` introduces multiple unchecked casts back to `RecommendationConfigItem`; consider a more type-safe approach (e.g., generics or separate accessors for CPU/memory vs multi-resource types) to avoid potential `ClassCastException`s and make the API clearer.

## Individual Comments

### Comment 1
<location path="src/main/java/com/autotune/analyzer/recommendations/engine/BaseRecommendationProcessor.java" line_range="250-251" />
<code_context>
+                    null != limitsMap.get(AnalyzerConstants.RecommendationItem.MEMORY)) {
+                currentMemLimit = (RecommendationConfigItem) limitsMap.get(AnalyzerConstants.RecommendationItem.MEMORY);
+            }
+            if (limitsMap.containsKey(AnalyzerConstants.RecommendationItem.ACCELERATORS) &&
+                    null != limitsMap.get(AnalyzerConstants.RecommendationItem.ACCELERATORS)) {
+
+            }
</code_context>
<issue_to_address>
**issue (bug_risk):** Accelerator limits branch is empty and the collected accelerator variables are unused, leaving this method incomplete.

The method declares `currentAcceleratorRequest`/`currentAcceleratorLimit` and checks `RecommendationItem.ACCELERATORS` in `limitsMap`, but never reads the value or sets the accelerator fields on `CurrentContainerConfigValues`. As a result, callers can’t obtain the current accelerator config. Please wire this through by extracting the accelerator recommendation(s) from `limitsMap` and setting the corresponding fields on `CurrentContainerConfigValues`.
</issue_to_address>
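
A rough sketch of what filling in the empty branch could look like, assuming the accelerator entry in `limitsMap` is a multi-resource wrapper. All types here (`Item`, `Resource`, `MultiResource`, `ConfigValues`) are simplified, hypothetical stand-ins for `RecommendationItem`, `ResourceRecommendation`, `MultiResourceRecommendation`, and `CurrentContainerConfigValues`, and the field name is invented.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical stand-ins; the real classes live in the PR's packages.
final class AcceleratorLimitsWiringSketch {
    enum Item { CPU, MEMORY, ACCELERATORS }

    interface Resource { }

    static final class MultiResource implements Resource {
        final List<String> acceleratorItems;

        MultiResource(List<String> items) {
            this.acceleratorItems = new ArrayList<>(items);
        }
    }

    static final class ConfigValues {
        // Invented field: holds the current accelerator limit entries.
        List<String> acceleratorLimitItems = List.of();
    }

    static ConfigValues extract(Map<Item, Resource> limitsMap) {
        ConfigValues values = new ConfigValues();
        Resource entry = limitsMap.get(Item.ACCELERATORS);
        // instanceof is false for null, so one check covers both the
        // missing-key case and the unexpected-type case.
        if (entry instanceof MultiResource) {
            values.acceleratorLimitItems =
                    List.copyOf(((MultiResource) entry).acceleratorItems);
        }
        return values;
    }
}
```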

### Comment 2
<location path="src/main/java/com/autotune/analyzer/recommendations/engine/BaseRecommendationProcessor.java" line_range="314-316" />
<code_context>
+            return acceleratorRequestRecommendationItems;
+        }
+
+        public void setAcceleratorRequestRecommendationItems(List<AcceleratorRecommendationItem> acceleratorRequestRecommendationItems) {
+            if (null != acceleratorRequestRecommendationItems && !acceleratorRequestRecommendationItems.isEmpty())
+                this.acceleratorRequestRecommendationItems = acceleratorRequestRecommendationItems;
+        }
+
</code_context>
<issue_to_address>
**suggestion:** Setters silently ignore null/empty lists, which may surprise callers expecting to clear the accelerator recommendations.

Because these setters only assign when the argument is non-null and non-empty, callers cannot clear existing recommendations and may unintentionally leave stale data when passing an empty list. Either allow empty lists (remove the `isEmpty` check) so callers can clear the state, or make these fields immutable and only set via constructor to make the behavior explicit.
</issue_to_address>
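
The first option (allowing callers to clear the state) might look like the sketch below. `AcceleratorRequestHolderSketch` is a hypothetical class using `String` in place of `AcceleratorRecommendationItem`; treating `null` as "clear" is an assumption that goes beyond what the comment asks for.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical holder illustrating a setter that can clear its state.
final class AcceleratorRequestHolderSketch {
    private List<String> acceleratorRequestItems = new ArrayList<>();

    // Assumption: null and empty both clear the stored items,
    // so stale recommendations cannot linger after an update.
    void setAcceleratorRequestItems(List<String> items) {
        this.acceleratorRequestItems =
                (items == null) ? new ArrayList<>() : new ArrayList<>(items);
    }

    // Returns an unmodifiable snapshot of the current items.
    List<String> getAcceleratorRequestItems() {
        return List.copyOf(acceleratorRequestItems);
    }
}
```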

### Comment 3
<location path="src/main/java/com/autotune/analyzer/recommendations/MultiResourceRecommendation.java" line_range="35-36" />
<code_context>
+        this.acceleratorRecommendationItems = new ArrayList<>();
+    }
+
+    public MultiResourceRecommendation(List<AcceleratorRecommendationItem> acceleratorRecommendationItems) {
+        this.acceleratorRecommendationItems = acceleratorRecommendationItems;
+    }
+
</code_context>
<issue_to_address>
**issue (bug_risk):** Constructor allows a null list which can cause NPEs in addAcceleratorRecommendationItem.

The second constructor assigns the parameter directly, so `new MultiResourceRecommendation(null)` followed by `addAcceleratorRecommendationItem` will NPE. Normalize `null` to an empty list in the constructor or lazily initialize the list inside `addAcceleratorRecommendationItem`.
</issue_to_address>
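
A minimal sketch of the constructor-side fix, assuming a simplified class: `MultiResourceRecommendationSketch` is hypothetical and uses `String` in place of `AcceleratorRecommendationItem`. Copying the incoming list is an extra choice not requested in the comment; it also protects against immutable lists such as `List.of(...)`, which would throw on `add`.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical, simplified version of the wrapper class under review.
final class MultiResourceRecommendationSketch {
    private final List<String> acceleratorRecommendationItems;

    MultiResourceRecommendationSketch() {
        this(null);
    }

    // Normalizes null to an empty list and copies the input so that
    // later add calls never hit a null or immutable list.
    MultiResourceRecommendationSketch(List<String> items) {
        this.acceleratorRecommendationItems =
                (items == null) ? new ArrayList<>() : new ArrayList<>(items);
    }

    void addAcceleratorRecommendationItem(String item) {
        if (item != null) {
            acceleratorRecommendationItems.add(item);
        }
    }

    List<String> getAcceleratorRecommendationItems() {
        return Collections.unmodifiableList(acceleratorRecommendationItems);
    }
}
```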

### Comment 4
<location path="src/main/java/com/autotune/analyzer/recommendations/utils/RecommendationUtils.java" line_range="761-766" />
<code_context>
+            return false;
+        }
+
+        for (AnalyzerConstants.MetricName metric : ACCELERATOR_METRICS) {
+            MetricResults metricResult =
+                    intervalResults.getMetricResultsMap().get(metric);
+
+            if (metricResult == null || metricResult.getMetadata() == null) {
+                continue;
+            }
+
+            if (AnalyzerConstants.DeviceType.ACCELERATOR.toString()
+                    .equalsIgnoreCase(metricResult.getMetadata().getType())) {
+                AcceleratorMetricMetadata acceleratorMetricMetadata = (AcceleratorMetricMetadata) metricResult.getMetadata();
+                if (null != acceleratorMetricMetadata.getModelName())
+                    return true;
</code_context>
<issue_to_address>
**suggestion (bug_risk):** Casting metadata to AcceleratorMetricMetadata without an instanceof check assumes a strong coupling that may not always hold.

In `hasAcceleratorData`, `metadata.getType()` is checked against `DeviceType.ACCELERATOR`, then `metadata` is blindly cast to `AcceleratorMetricMetadata`. If another `MetricMetadata` implementation returns the same type, this will cause a `ClassCastException`. Either guard the cast with `instanceof AcceleratorMetricMetadata` or enforce the concrete type at the wiring/creation point so this assumption is guaranteed.

```suggestion
            if (AnalyzerConstants.DeviceType.ACCELERATOR.toString()
                    .equalsIgnoreCase(metricResult.getMetadata().getType())
                    && metricResult.getMetadata() instanceof AcceleratorMetricMetadata) {
                AcceleratorMetricMetadata acceleratorMetricMetadata =
                        (AcceleratorMetricMetadata) metricResult.getMetadata();
                if (acceleratorMetricMetadata.getModelName() != null) {
                    return true;
                }
            }
```
</issue_to_address>

