
[PR-3] Runtime Integration #1954

Open
khansaad wants to merge 7 commits into kruize:runtimes-iirj from khansaad:mds-pr-3-runtime-integration
Conversation

khansaad (Contributor) commented Jun 11, 2026

Description

This PR integrates multi-datasource layer handling into the runtime recommendation flow so detected layer information is available during recommendation processing.

Fixes # (issue)

Type of change

  • Bug fix
  • New feature
  • Docs update
  • Breaking change (What changes might users need to make in their application due to this PR?)
  • Requires DB changes

How has this been tested?

Please describe the tests that were run to verify your changes and steps to reproduce. Please specify any test configuration required.

  • New Test X
  • Functional testsuite

Test Configuration

  • Kubernetes clusters tested on:

Checklist 🎯

  • Followed coding guidelines
  • Comments added
  • Dependent changes merged
  • Documentation updated
  • Tests added or updated

Additional information

Include any additional information such as links, test results, screenshots here

Summary by Sourcery

Add multi-datasource support to experiments and integrate it into layer detection, validation, and recommendation flows while preserving backward compatibility with the existing single datasource field.

New Features:

  • Support configuring multiple datasources per experiment via new datasources fields in API and internal model.
  • Enable runtime recommendation and metrics collection flows to work with multi-datasource setups, requiring a Prometheus datasource for PromQL-based queries.

Bug Fixes:

  • Prevent attempts to run runtime recommendations or metrics collection when no serviceable datasource is configured, emitting clear validation and runtime warnings instead.

Enhancements:

  • Extend layer detection to use configured datasources and surface detected layer information into recommendation processing.
  • Improve datasource validation to check all configured datasource names and provide clearer error messages when none are serviceable.
  • Adjust accelerator detection logic to skip Prometheus-specific checks when only non-Prometheus datasources such as Cryostat are available.

sourcery-ai Bot (Contributor) commented Jun 11, 2026

Reviewer's Guide

Integrates multi-datasource awareness (with Prometheus as mandatory for metrics) throughout experiment validation, layer detection, and recommendation runtime, while preserving backward compatibility with the legacy single datasource field and adding Cryostat-specific handling for metrics and accelerator detection paths.

Sequence diagram for Prometheus datasource selection in RecommendationEngine

sequenceDiagram
    participant RecommendationEngine
    participant KruizeObject
    participant DataSourceCollection
    participant DataSourceInfo

    RecommendationEngine->>KruizeObject: getDatasources()
    alt datasources not empty
        loop for each ds in datasources
            RecommendationEngine->>DataSourceCollection: getInstance()
            RecommendationEngine->>DataSourceCollection: getDataSourcesCollection()
            RecommendationEngine->>DataSourceCollection: get(ds)
            DataSourceCollection-->>RecommendationEngine: DataSourceInfo
            alt DataSourceInfo not null
                RecommendationEngine->>DataSourceInfo: getProvider()
                alt provider is PROMETHEUS
                    RecommendationEngine->>RecommendationEngine: set dataSource = ds
                    RecommendationEngine->>RecommendationEngine: break
                end
            end
        end
    else datasources empty
        RecommendationEngine->>KruizeObject: getDataSource()
        alt dataSource not null
            RecommendationEngine->>DataSourceCollection: getInstance()
            RecommendationEngine->>DataSourceCollection: getDataSourcesCollection()
            RecommendationEngine->>DataSourceCollection: get(dataSource)
            DataSourceCollection-->>RecommendationEngine: DataSourceInfo
            alt DataSourceInfo null or provider not PROMETHEUS
                RecommendationEngine->>RecommendationEngine: set dataSource = null
            end
        end
    end

    alt dataSource is null
        RecommendationEngine->>RecommendationEngine: throw Exception
    end
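
The selection flow in the diagram above could be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `PrometheusSelectionSketch`, its `Provider` enum, and the `Map` standing in for `DataSourceCollection` are hypothetical, and `IllegalStateException` is an assumed stand-in for the exception the diagram mentions.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-ins: the Map plays the role of DataSourceCollection and the
// enum plays the role of the provider reported by DataSourceInfo.getProvider().
final class PrometheusSelectionSketch {
    enum Provider { PROMETHEUS, CRYOSTAT }

    /**
     * Returns the first configured name whose registered provider is PROMETHEUS.
     * The legacy single name is consulted only when the list is null or empty.
     */
    static String selectPrometheus(List<String> datasources, String legacyDataSource,
                                   Map<String, Provider> registry) {
        if (datasources != null && !datasources.isEmpty()) {
            for (String name : datasources) {
                // Map.get returns null for unknown names, so they are skipped.
                if (registry.get(name) == Provider.PROMETHEUS) {
                    return name;
                }
            }
        } else if (legacyDataSource != null
                && registry.get(legacyDataSource) == Provider.PROMETHEUS) {
            return legacyDataSource;
        }
        // Assumed exception type; the diagram only says "throw Exception".
        throw new IllegalStateException("No Prometheus datasource resolved for experiment");
    }

    public static void main(String[] args) {
        Map<String, Provider> registry = new HashMap<>();
        registry.put("prom-1", Provider.PROMETHEUS);
        registry.put("cryostat-1", Provider.CRYOSTAT);
        // The Cryostat entry is skipped and the Prometheus entry is selected.
        System.out.println(selectPrometheus(Arrays.asList("cryostat-1", "prom-1"), null, registry));
    }
}
```

Note that in this sketch a non-empty list with no Prometheus entry fails even if the legacy field names a Prometheus datasource, which mirrors the branch structure of the diagram.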

Sequence diagram for PromQL datasource resolution with Cryostat

sequenceDiagram
    participant RecommendationEngine
    participant DataSourceCollection
    participant DataSourceInfo

    RecommendationEngine->>RecommendationEngine: fetchMetricsBasedOnProfileAndDatasource(..., dataSourceInfo, ...)
    alt [dataSourceInfo.provider is CRYOSTAT]
        RecommendationEngine->>RecommendationEngine: promQLDataSourceInfo = dataSourceInfo
        RecommendationEngine->>DataSourceCollection: getInstance()
        RecommendationEngine->>DataSourceCollection: getDataSourcesCollection()
        loop for each ds in values()
            RecommendationEngine->>DataSourceInfo: getProvider()
            alt [provider is PROMETHEUS]
                RecommendationEngine->>RecommendationEngine: promQLDataSourceInfo = ds
                RecommendationEngine->>RecommendationEngine: loop break
            end
        end
    else [dataSourceInfo.provider is not CRYOSTAT]
        RecommendationEngine->>RecommendationEngine: promQLDataSourceInfo = dataSourceInfo
    end

    RecommendationEngine->>RecommendationEngine: fetchContainerMetricsBasedOnDataSourceAndProfile(..., promQLDataSourceInfo, ...)

    alt [promQLDataSourceInfo.provider is PROMETHEUS]
        RecommendationEngine->>RecommendationEngine: accelerator detection enabled
    else [non-PROMETHEUS]
        RecommendationEngine->>RecommendationEngine: skip accelerator detection
    end
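
A rough sketch of the resolution and gating steps in the diagram above, under stated assumptions: `PromQlResolutionSketch` and its nested `DataSource` and `Provider` types are hypothetical stand-ins for `DataSourceInfo` and the provider check, and the registered collection stands in for the values of `DataSourceCollection`.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.List;

final class PromQlResolutionSketch {
    enum Provider { PROMETHEUS, CRYOSTAT }

    // Hypothetical, simplified stand-in for DataSourceInfo.
    static final class DataSource {
        final String name;
        final Provider provider;

        DataSource(String name, Provider provider) {
            this.name = name;
            this.provider = provider;
        }
    }

    /**
     * For a Cryostat datasource, returns the first registered Prometheus datasource.
     * If none is registered, the original datasource is returned unchanged.
     */
    static DataSource resolvePromQlDataSource(DataSource requested,
                                              Collection<DataSource> registered) {
        if (requested.provider != Provider.CRYOSTAT) {
            return requested;
        }
        for (DataSource candidate : registered) {
            if (candidate.provider == Provider.PROMETHEUS) {
                return candidate;
            }
        }
        return requested;
    }

    /** Accelerator detection runs only against a Prometheus-backed datasource. */
    static boolean acceleratorDetectionEnabled(DataSource promQlDataSource) {
        return promQlDataSource.provider == Provider.PROMETHEUS;
    }

    public static void main(String[] args) {
        DataSource cryostat = new DataSource("cryostat-1", Provider.CRYOSTAT);
        DataSource prometheus = new DataSource("prom-1", Provider.PROMETHEUS);
        List<DataSource> registered = Arrays.asList(cryostat, prometheus);

        DataSource resolved = resolvePromQlDataSource(cryostat, registered);
        System.out.println(resolved.name + " accelerators=" + acceleratorDetectionEnabled(resolved));
    }
}
```

In a Cryostat-only setup the method returns the Cryostat datasource itself, so `acceleratorDetectionEnabled` is false and the accelerator steps are skipped.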

File-Level Changes

Change: Add multi-datasource support to experiment and API models with backward compatibility for the legacy single datasource field.
Details:
  • Deprecate single datasource field but retain it for backward compatibility in KruizeObject and CreateExperimentAPIObject.
  • Introduce datasources list field on both experiment domain and API objects with getters/setters.
  • Implement getDatasources() helpers that adapt a legacy single datasource into a list or return an empty list when unset.
  • Wire datasources from CreateExperimentAPIObject into KruizeObject in the Converters utility.
Files:
  • src/main/java/com/autotune/analyzer/kruizeObject/KruizeObject.java
  • src/main/java/com/autotune/analyzer/serviceObjects/CreateExperimentAPIObject.java
  • src/main/java/com/autotune/analyzer/serviceObjects/Converters.java

Change: Update experiment validation logic to work with multiple datasources and enforce that at least one valid datasource is configured for local monitoring.
Details:
  • Change datasource existence validation to iterate over kruizeObject.getDatasources() and validate each name against DataSourceCollection.
  • Return a specific error when no serviceable datasources are configured rather than failing only on a missing single datasource.
  • Update mandatory field validation for local monitoring to accept either the legacy single datasource or a non-empty datasources list.
Files:
  • src/main/java/com/autotune/analyzer/experiment/ExperimentValidation.java

Change: Extend layer detection to run against configured datasources and aggregate layer info per container with robust logging and error handling.
Details:
  • Refactor ServiceHelpers.detectLayers to obtain datasources from CreateExperimentAPIObject.getDatasources() with null/empty guards and logging.
  • Iterate through Kubernetes and container objects, calling LayerUtils.detectLayers with the namespace, container name, and datasources list.
  • Aggregate detected layers into a per-container map; on success, persist it via containerAPIObject.setLayerMap and log counts; log and swallow exceptions per container to avoid failing the whole flow.
Files:
  • src/main/java/com/autotune/analyzer/utils/ServiceHelpers.java

Change: Make the recommendation engine multi-datasource-aware, requiring a Prometheus datasource for metrics collection, and add Cryostat-specific behavior for PromQL queries and accelerator detection.
Details:
  • Replace simple kruizeObject.getDataSource() usage with logic that selects a datasource whose provider is Prometheus from kruizeObject.getDatasources() or validates the legacy single datasource as Prometheus-only.
  • Throw an exception if no Prometheus datasource can be resolved for an experiment.
  • In fetchMetricsBasedOnProfileAndDatasource, introduce promQLDataSourceInfo, which for Cryostat providers is reassigned to a Prometheus datasource from DataSourceCollection for use with PromQL queries; log when none is found.
  • Ensure both container and namespace metric fetchers use promQLDataSourceInfo rather than the original Cryostat datasource for PromQL-based queries.
  • Gate accelerator and accelerator-partition detection logic so it runs only when the underlying datasource provider is Prometheus, effectively skipping those steps for Cryostat-only scenarios.
Files:
  • src/main/java/com/autotune/analyzer/recommendations/engine/RecommendationEngine.java

Change: Align runtime recommendation processing with multi-datasource semantics while keeping legacy behavior when any datasource is present.
Details:
  • Adjust RuntimeRecommendationProcessor.handleRuntimeRecommendations to treat a missing datasource as "no data" only when both the legacy single datasource and the datasources list are absent or empty; otherwise proceed.
  • Maintain existing use of container layer maps, which are now populated via the multi-datasource-aware layer detection.
Files:
  • src/main/java/com/autotune/analyzer/recommendations/engine/RuntimeRecommendationProcessor.java

Change: Extend analyzer constants to support pod label usage in presence detection logic.
Details:
  • Add LABEL_POD constant to AnalyzerConstants.RuntimeLayerConstants for future use in PromQL label-based layer presence detection.
Files:
  • src/main/java/com/autotune/analyzer/utils/AnalyzerConstants.java
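
The presence check described above for ExperimentValidation and RuntimeRecommendationProcessor (proceed when either the legacy field or the list is set) might look something like the sketch below. The class and method names are hypothetical, and treating a blank legacy value as absent is an assumption that the summary does not confirm.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

final class DatasourcePresenceSketch {
    /**
     * True when either the legacy single datasource or the datasources list
     * carries a value. Blank legacy values count as absent (an assumption).
     */
    static boolean hasAnyDatasource(String legacyDatasource, List<String> datasources) {
        boolean hasLegacy = legacyDatasource != null && !legacyDatasource.trim().isEmpty();
        boolean hasList = datasources != null && !datasources.isEmpty();
        return hasLegacy || hasList;
    }

    public static void main(String[] args) {
        System.out.println(hasAnyDatasource("prom-1", null));
        System.out.println(hasAnyDatasource(null, Arrays.asList("prom-1", "cryostat-1")));
        System.out.println(hasAnyDatasource(null, Collections.<String>emptyList()));
    }
}
```

Only the last call, where both inputs are absent or empty, would be treated as the "no data" case.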

khansaad moved this to Under Review in Monitoring on Jun 11, 2026

sourcery-ai Bot (Contributor) left a comment

Hey - I've found 1 issue, and left some high-level feedback:

  • When falling back from a Cryostat datasource to Prometheus in fetchMetricsBasedOnProfileAndDatasource, you currently select the first Prometheus entry from the global DataSourceCollection; consider restricting this to the experiment's configured datasources to avoid silently using an unintended Prometheus instance.
  • RuntimeRecommendationProcessor still treats the datasource as effectively singular (only gating on presence of getDataSource/getDatasources); if multiple datasources are configured, it may be worth explicitly selecting or validating which one drives runtime recommendations to avoid ambiguous behavior.
  • The datasource validation logic mixes direct field checks (getDataSource() + null checks on getDatasources()) with getDatasources()'s backward-compatible behavior; consider standardizing on getDatasources() everywhere to simplify the conditions and avoid subtle differences between validation paths.
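
One way the first point above could be approached is to look up a Prometheus fallback only among the experiment's configured names. The sketch below is illustrative rather than a proposed patch: `ScopedFallbackSketch`, its `Provider` enum, and the `Map` standing in for `DataSourceCollection` are hypothetical.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

final class ScopedFallbackSketch {
    enum Provider { PROMETHEUS, CRYOSTAT }

    /**
     * Searches only the experiment's configured names for a Prometheus datasource,
     * ignoring any other Prometheus instance registered globally.
     */
    static Optional<String> findPrometheus(List<String> configuredNames,
                                           Map<String, Provider> registry) {
        if (configuredNames == null) {
            return Optional.empty();
        }
        for (String name : configuredNames) {
            if (registry.get(name) == Provider.PROMETHEUS) {
                return Optional.of(name);
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        // Insertion-ordered registry: an unrelated Prometheus instance comes first.
        Map<String, Provider> registry = new LinkedHashMap<>();
        registry.put("prom-other", Provider.PROMETHEUS);
        registry.put("cryostat-exp", Provider.CRYOSTAT);
        registry.put("prom-exp", Provider.PROMETHEUS);

        List<String> configured = Arrays.asList("cryostat-exp", "prom-exp");
        // Selects prom-exp, not the globally first prom-other.
        System.out.println(findPrometheus(configured, registry).orElse("none"));
    }
}
```

Returning an `Optional` leaves it to the caller to decide whether a missing Prometheus datasource is an error or a reason to skip PromQL-based steps.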
Prompt for AI Agents
Please address the comments from this code review:

## Individual Comments

### Comment 1
<location path="src/main/java/com/autotune/analyzer/serviceObjects/CreateExperimentAPIObject.java" line_range="177-178" />
<code_context>
+     */
+    public List<String> getDatasources() {
+        // Backward compatibility: if datasources is null but datasource is set
+        if (datasources == null && datasource != null) {
+            return Arrays.asList(datasource);
+        }
</code_context>
<issue_to_address>
**issue (bug_risk):** Using `List.of` for backward compatibility may cause issues on Java 8 runtimes.

`KruizeObject#getDatasources` already uses `Arrays.asList(datasource)` for this backward-compatibility case. For consistency and to avoid Java 8 compilation issues, use `Collections.singletonList(datasource)` or `Arrays.asList(datasource)` instead of `List.of(datasource)` here.
</issue_to_address>
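
A minimal sketch of a Java 8-compatible version of the getter discussed in Comment 1, using `Collections.singletonList` as the comment suggests. `LegacyDatasourceAdapterSketch` is a hypothetical class, and its two fields only mirror the `datasource` and `datasources` names from the snippet above.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

final class LegacyDatasourceAdapterSketch {
    private final String datasource;        // legacy single field
    private final List<String> datasources; // new list field

    LegacyDatasourceAdapterSketch(String datasource, List<String> datasources) {
        this.datasource = datasource;
        this.datasources = datasources;
    }

    /** Adapts the legacy field into a one-element list; never returns null. */
    List<String> getDatasources() {
        if (datasources == null && datasource != null) {
            // Available since Java 1.3, unlike List.of, which needs Java 9+.
            return Collections.singletonList(datasource);
        }
        return datasources == null ? Collections.<String>emptyList() : datasources;
    }

    public static void main(String[] args) {
        System.out.println(new LegacyDatasourceAdapterSketch("prom-1", null).getDatasources());
        System.out.println(new LegacyDatasourceAdapterSketch(null, null).getDatasources());
        System.out.println(new LegacyDatasourceAdapterSketch("prom-1",
                Arrays.asList("prom-2", "cryostat-1")).getDatasources());
    }
}
```

When both fields are set, this sketch returns the list and ignores the legacy value; whether that precedence matches the PR is an assumption.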


Signed-off-by: Saad Khan <saakhan@ibm.com>
Signed-off-by: Saad Khan <saakhan@ibm.com>
khansaad force-pushed the mds-pr-3-runtime-integration branch from 03b6cd6 to 7e13736 on June 11, 2026 20:00
khansaad added 5 commits June 12, 2026 01:39
Signed-off-by: Saad Khan <saakhan@ibm.com>
Signed-off-by: Saad Khan <saakhan@ibm.com>
Signed-off-by: Saad Khan <saakhan@ibm.com>
Signed-off-by: Saad Khan <saakhan@ibm.com>
Signed-off-by: Saad Khan <saakhan@ibm.com>
khansaad force-pushed the mds-pr-3-runtime-integration branch from 7e13736 to 8b5aaac on June 11, 2026 20:27