Skip to content

Add support for Cluster, Custom Terms and Models in Bulk API#1969

Open
khansaad wants to merge 3 commits into
kruize:runtimes-iirjfrom
khansaad:feature/bulk-api-cluster-settings-clean
Open

Add support for Cluster, Custom Terms and Models in Bulk API#1969
khansaad wants to merge 3 commits into
kruize:runtimes-iirjfrom
khansaad:feature/bulk-api-cluster-settings-clean

Conversation

@khansaad

@khansaad khansaad commented Jun 18, 2026

Copy link
Copy Markdown
Contributor

Description

  • Add cluster_name, model_settings, term_settings fields to BulkInput
  • Implement passthrough logic in BulkJobManager
  • Add comprehensive validation for new fields
  • Maintain backward compatibility (all fields optional)
  • Enables Optimizer to specify:
  • Cluster name for experiments
  • Custom recommendation models (performance, cost)
  • Custom recommendation terms (short, medium, long)

Fixes # (issue)

Type of change

  • Bug fix
  • New feature
  • Docs update
  • Breaking change (What changes might users need to make in their application due to this PR?)
  • Requires DB changes

How has this been tested?

Please describe the tests that were run to verify your changes and steps to reproduce. Please specify any test configuration required.

  • New Test X
  • Functional testsuite

Test Configuration

  • Kubernetes clusters tested on:

Checklist 🎯

  • Followed coding guidelines
  • Comments added
  • Dependent changes merged
  • Documentation updated
  • Tests added or updated

Additional information

Include any additional information such as links, test results, screenshots here

Summary by Sourcery

Extend Bulk API and recommendation engine to support cluster-aware configuration and customizable recommendation models/terms, while enhancing datasource metadata and notifications.

New Features:

  • Add optional cluster_name, model_settings, and term_settings fields to Bulk API requests and propagate them into experiment creation and recommendation settings.
  • Allow datasources to be associated with one or more clusters, exposing cluster information via the List Datasources API and persisting it in the database with a schema migration.
  • Introduce new recommendation notifications indicating CPU and memory under- and over-provisioning based on generated recommendations.

Enhancements:

  • Tighten Bulk API validation for cluster_name, model_settings, and term_settings, including format, allowed values, and boundary checks.
  • Improve Bulk API and datasource documentation to describe cluster association, new Bulk API fields, and their behavior.

Build:

  • Add a database migration to store associated clusters on kruize_datasources entries.

Documentation:

  • Document Bulk API support for cluster_name, model_settings, and term_settings, including examples and usage scenarios.
  • Document datasource cluster association semantics, configuration, and integration with optimizer, and extend notification codes documentation for new provisioning status notifications.

Tests:

  • Add unit tests for BulkServiceValidation, BulkJobManager passthrough behavior, and KruizeDataSourceEntry cluster handling, plus new REST tests for Bulk API validation, combined custom settings, and backward compatibility.

@khansaad khansaad self-assigned this Jun 18, 2026
@khansaad khansaad added the enhancement New feature or request label Jun 18, 2026
@khansaad khansaad moved this to In Progress in Monitoring Jun 18, 2026
@sourcery-ai

sourcery-ai Bot commented Jun 18, 2026

Copy link
Copy Markdown
Contributor

Reviewer's Guide

Adds optional cluster_name, model_settings, and term_settings to the Bulk API and Bulk job pipeline, including validation, passthrough into experiment creation, datasource cluster association, and new recommendation notifications, while keeping backward compatibility and extending docs and tests.

Sequence diagram for Bulk API cluster and recommendation settings passthrough

sequenceDiagram
    actor Optimizer
    participant BulkAPI as BulkJobManager
    participant Validation as BulkServiceValidation
    participant DSc as DataSourceConfig
    participant ExpReq as CreateExperimentAPIObject

    Optimizer->>BulkAPI: submitBulkJob(BulkInput)
    BulkAPI->>Validation: validate(payload, jobID)
    Validation-->>BulkAPI: ValidationOutputData
    alt [validation passes]
        BulkAPI->>BulkAPI: prepareCreateExperimentJSONInput(dsc, namespace)
        BulkAPI->>DSc: getDataSourceClusterName()
        DSc-->>BulkAPI: clusterNameFromMetadata
        alt [bulkInput.getCluster_name() provided and non-empty]
            BulkAPI->>ExpReq: setClusterName(cluster_name from BulkInput)
        else [no cluster_name in BulkInput]
            BulkAPI->>ExpReq: setClusterName(clusterNameFromMetadata)
        end
        BulkAPI->>ExpReq: setRecommendationSettings(RecommendationSettings)
        alt [bulkInput.getModel_settings() not null]
            BulkAPI->>ExpReq: RecommendationSettings.setModelSettings(model_settings)
        end
        alt [bulkInput.getTerm_settings() not null]
            BulkAPI->>ExpReq: RecommendationSettings.setTermSettings(term_settings)
        end
    else [validation fails]
        BulkAPI-->>Optimizer: error response
    end
Loading

File-Level Changes

Change Details Files
Extend Bulk API payload and job processing to support custom cluster, model, and term settings with passthrough into experiment creation.
  • Add cluster_name, model_settings, and term_settings fields to BulkInput with getters/setters and update isEmpty check to consider new fields
  • Update BulkJobManager.prepareCreateExperimentJSONInput to override metadata cluster with trimmed bulk cluster_name when provided
  • Update BulkJobManager.prepareCreateExperimentJSONInput to set RecommendationSettings.modelSettings and termSettings from BulkInput when present, preserving existing threshold behavior
src/main/java/com/autotune/analyzer/serviceObjects/BulkInput.java
src/main/java/com/autotune/analyzer/workerimpl/BulkJobManager.java
src/test/java/com/autotune/analyzer/workerimpl/BulkJobManagerPassthroughTest.java
Implement validation for cluster_name, model_settings, and term_settings in the Bulk service layer and cover it with unit and REST tests.
  • Add DNS-1123-based cluster_name validation, model/term name whitelists, and wiring into BulkServiceValidation.validate()
  • Create JUnit tests for validateClusterName, validateModelSettings, and validateTermSettings, including edge cases
  • Add REST bulk API tests validating cluster_name, model_settings, term_settings, their combinations, and backward compatibility
src/main/java/com/autotune/common/bulk/BulkServiceValidation.java
src/test/java/com/autotune/common/bulk/BulkServiceValidationTest.java
tests/scripts/local_monitoring_tests/rest_apis/test_bulkAPI.py
Introduce datasource-to-cluster association and expose it through API and persistence.
  • Extend KruizeDataSourceEntry with a clusters text column (plus list helpers) and add DB migration to add the column
  • Propagate clusters between DB and DataSourceInfo via DBHelpers and extend DataSourceInfo with a clusters list and accessor
  • Parse clusters arrays from datasource config JSON in DataSourceCollection and populate DataSourceInfo
  • Register a custom Gson DataSourceInfoAdapter and ensure ListDatasources serializes non-empty clusters while omitting empty ones
src/main/java/com/autotune/database/table/KruizeDataSourceEntry.java
migrations/lm/v111__add_cluster_to_datasources.sql
src/main/java/com/autotune/common/datasource/DataSourceInfo.java
src/main/java/com/autotune/common/datasource/DataSourceCollection.java
src/main/java/com/autotune/database/helper/DBHelpers.java
src/main/java/com/autotune/analyzer/services/ListDatasources.java
src/main/java/com/autotune/analyzer/adapters/DataSourceInfoAdapter.java
src/test/java/com/autotune/database/table/KruizeDataSourceEntryTest.java
Enhance recommendation notifications to indicate CPU/Memory under/over-provisioning and document notification codes.
  • Add new NOTICE_* notification codes and messages for CPU_REQUESTS_UNDER/OVER_PROVISIONED and MEMORY_REQUESTS_UNDER/OVER_PROVISIONED
  • Update RecommendationEngine to emit the new notifications when threshold is exceeded and direction of change indicates under vs over provisioning
  • Extend NotificationCodes.md table and scope matrix to include new codes
src/main/java/com/autotune/analyzer/recommendations/RecommendationConstants.java
src/main/java/com/autotune/analyzer/recommendations/engine/RecommendationEngine.java
design/NotificationCodes.md
Update design docs to describe new Bulk API fields and datasource cluster configuration semantics.
  • Document Bulk API cluster_name, model_settings, and term_settings fields, examples, and behavior in BulkAPI.md
  • Document datasource.cluster configuration, behavior, and use cases, and update schema table and metadata examples in KruizeDatasource.md
design/BulkAPI.md
design/KruizeDatasource.md

Tips and commands

Interacting with Sourcery

  • Trigger a new review: Comment @sourcery-ai review on the pull request.
  • Continue discussions: Reply directly to Sourcery's review comments.
  • Generate a GitHub issue from a review comment: Ask Sourcery to create an
    issue from a review comment by replying to it. You can also reply to a
    review comment with @sourcery-ai issue to create an issue from it.
  • Generate a pull request title: Write @sourcery-ai anywhere in the pull
    request title to generate a title at any time. You can also comment
    @sourcery-ai title on the pull request to (re-)generate the title at any time.
  • Generate a pull request summary: Write @sourcery-ai summary anywhere in
    the pull request body to generate a PR summary at any time exactly where you
    want it. You can also comment @sourcery-ai summary on the pull request to
    (re-)generate the summary at any time.
  • Generate reviewer's guide: Comment @sourcery-ai guide on the pull
    request to (re-)generate the reviewer's guide at any time.
  • Resolve all Sourcery comments: Comment @sourcery-ai resolve on the
    pull request to resolve all Sourcery comments. Useful if you've already
    addressed all the comments and don't want to see them anymore.
  • Dismiss all Sourcery reviews: Comment @sourcery-ai dismiss on the pull
    request to dismiss all existing Sourcery reviews. Especially useful if you
    want to start fresh with a new review - don't forget to comment
    @sourcery-ai review to trigger a new review!

Customizing Your Experience

Access your dashboard to:

  • Enable or disable review features such as the Sourcery-generated pull request
    summary, the reviewer's guide, and others.
  • Change the review language.
  • Add, remove or edit custom review instructions.
  • Adjust other review settings.

Getting Help

@sourcery-ai sourcery-ai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey - I've found 5 issues, and left some high level feedback:

  • When parsing and storing comma-separated cluster names in KruizeDataSourceEntry.getClusterList() and related code paths, consider trimming each token after split(",") so that configurations like "default, stage" don’t leave leading spaces in cluster identifiers.
  • There is a potential naming mismatch between the design (datasources[].cluster) and the implementation (DataSourceMetadataInfoJSONKeys.CLUSTERS and clusters field on DataSourceInfo/DB); please ensure the JSON key used in datasource configs is consistent and that the code reads the same key to avoid silent misconfiguration.
  • In DataSourceInfoAdapter, you serialize the authentication field as authenticationConfig, while the public design docs reference authentication; consider renaming this JSON property (or adding a backwards-compatible alias) so that the ListDatasources API aligns with the documented schema.
Prompt for AI Agents
Please address the comments from this code review:

## Overall Comments
- When parsing and storing comma-separated cluster names in `KruizeDataSourceEntry.getClusterList()` and related code paths, consider trimming each token after `split(",")` so that configurations like `"default, stage"` don’t leave leading spaces in cluster identifiers.
- There is a potential naming mismatch between the design (`datasources[].cluster`) and the implementation (`DataSourceMetadataInfoJSONKeys.CLUSTERS` and `clusters` field on `DataSourceInfo`/DB); please ensure the JSON key used in datasource configs is consistent and that the code reads the same key to avoid silent misconfiguration.
- In `DataSourceInfoAdapter`, you serialize the authentication field as `authenticationConfig`, while the public design docs reference `authentication`; consider renaming this JSON property (or adding a backwards-compatible alias) so that the ListDatasources API aligns with the documented schema.

## Individual Comments

### Comment 1
<location path="src/main/java/com/autotune/common/bulk/BulkServiceValidation.java" line_range="51-60" />
<code_context>
+    
+    // Kubernetes DNS-1123 subdomain validation constants
+    private static final int MAX_CLUSTER_NAME_LENGTH = 253;
+    private static final String CLUSTER_NAME_REGEX = "^[a-z0-9]([-a-z0-9]*[a-z0-9])?(\\.[a-z0-9]([-a-z0-9]*[a-z0-9])?)*$";
+    private static final String CLUSTER_NAME_FORMAT_ERROR =
+            "Invalid cluster_name format. Must be a valid Kubernetes DNS-1123 subdomain: " +
</code_context>
<issue_to_address>
**issue (bug_risk):** Cluster name validation does not enforce the DNS-1123 63-char per-label limit despite the comment/spec reference.

Current validation only checks a 253-char max and the general DNS-1123 subdomain pattern, but not the 63-character per-label limit that Kubernetes enforces. Because the error message and constant claim compliance with the Kubernetes DNS-1123 subdomain spec, this may permit cluster names that Kubernetes will later reject. Please either enforce the 63-char per-label constraint (e.g., via a stricter regex or an additional length check) or relax the wording of the error message to match the actual behavior.
</issue_to_address>

### Comment 2
<location path="src/main/java/com/autotune/database/table/KruizeDataSourceEntry.java" line_range="126" />
<code_context>
+        if (clusters == null || clusters.trim().isEmpty()) {
+            return new ArrayList<>();
+        }
+        return new ArrayList<>(Arrays.asList(clusters.split(",")));
+    }
+
</code_context>
<issue_to_address>
**suggestion (bug_risk):** Split cluster list entries are not trimmed, which may cause subtle matching issues if config uses spaces after commas.

For inputs like "cluster-a, cluster-b", this will return "cluster-a" and " cluster-b" (with a leading space), which can break exact string comparisons. Consider trimming each split element and filtering out any empty strings produced by patterns like ",," before returning the list.

Suggested implementation:

```java
    /**
     * Get cluster names as a list
     * @return List of cluster names, empty list if no clusters defined
     */
    public List<String> getClusterList() {
        if (clusters == null || clusters.trim().isEmpty()) {
            return new ArrayList<>();
        }
        return Arrays.stream(clusters.split(","))
                .map(String::trim)
                .filter(s -> !s.isEmpty())
                .collect(Collectors.toList());
    }

import jakarta.persistence.*;

import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

```

1. Ensure the `import` section is at the top of the file (above the class declaration) and not inside any method body. In the provided snippet, the `import` lines appear directly after the method, which is not valid Java syntax. Move all `import` statements to the correct place if they are not already there in the full file.
2. If there are existing static imports or other stream-related imports, keep the style consistent (e.g., if `Collectors.toList()` is statically imported elsewhere, you might want to follow that convention).
</issue_to_address>

### Comment 3
<location path="src/main/java/com/autotune/common/datasource/DataSourceCollection.java" line_range="204-214" />
<code_context>
+                if (dataSourceObject.has(KruizeConstants.DataSourceConstants.DataSourceMetadataInfoJSONKeys.CLUSTERS)) {
+                    JSONArray clustersArray = dataSourceObject.optJSONArray(KruizeConstants.DataSourceConstants.DataSourceMetadataInfoJSONKeys.CLUSTERS);
+                    if (clustersArray != null) {
+                        for (int i = 0; i < clustersArray.length(); i++) {
+                            clusters.add(clustersArray.getString(i));
+                        }
+                        LOGGER.info("Clusters for datasource {}: {}", name, clusters);
</code_context>
<issue_to_address>
**suggestion:** Cluster names from config are accepted verbatim without trimming or basic validation.

Since the JSON values are added directly, entries like " cluster-a" or "" can end up in `clusters`. To avoid dirty data and downstream issues, trim each value before adding it and skip empty results (and optionally log or reject clearly invalid names).

```suggestion
                // Extract clusters array if present
                List<String> clusters = new ArrayList<>();
                if (dataSourceObject.has(KruizeConstants.DataSourceConstants.DataSourceMetadataInfoJSONKeys.CLUSTERS)) {
                    JSONArray clustersArray = dataSourceObject.optJSONArray(KruizeConstants.DataSourceConstants.DataSourceMetadataInfoJSONKeys.CLUSTERS);
                    if (clustersArray != null) {
                        for (int i = 0; i < clustersArray.length(); i++) {
                            String clusterName = clustersArray.optString(i, null);
                            if (clusterName == null) {
                                LOGGER.warn("Null cluster name encountered for datasource '{}', index {}. Skipping entry.", name, i);
                                continue;
                            }
                            String trimmedClusterName = clusterName.trim();
                            if (trimmedClusterName.isEmpty()) {
                                LOGGER.warn("Blank or whitespace-only cluster name encountered for datasource '{}', index {}. Original value: '{}'. Skipping entry.", name, i, clusterName);
                                continue;
                            }
                            clusters.add(trimmedClusterName);
                        }
                        LOGGER.info("Clusters for datasource {}: {}", name, clusters);
                    }
                }
```
</issue_to_address>

### Comment 4
<location path="src/main/java/com/autotune/analyzer/adapters/DataSourceInfoAdapter.java" line_range="39-41" />
<code_context>
+        
+        jsonObject.addProperty("name", src.getName());
+        jsonObject.addProperty("provider", src.getProvider());
+        jsonObject.addProperty("serviceName", src.getServiceName());
+        jsonObject.addProperty("namespace", src.getNamespace());
+        jsonObject.addProperty("url", src.getUrl() != null ? src.getUrl().toString() : null);
+        
+        // Serialize authenticationConfig
</code_context>
<issue_to_address>
**issue (bug_risk):** The custom serializer will emit explicit nulls for fields that were previously omitted, which may be a breaking change for consumers.

`addProperty` with null produces explicit `"serviceName": null`, `"namespace": null`, and `"url": null` instead of omitting them, unlike the default Gson behavior without `serializeNulls`. If clients depend on these fields being absent rather than null, this changes the wire format. To keep the previous behavior, only add these properties when they are non-null, as you already do for `authenticationConfig` and `clusters`.
</issue_to_address>

### Comment 5
<location path="design/KruizeDatasource.md" line_range="34" />
<code_context>
+| `datasources[].provider`                         | string | Yes      | Provider type (e.g., "prometheus")                     |
+| `datasources[].url` OR `serviceName`+`namespace` | string | Yes      | Connection endpoint                                    |
+| `datasources[].authentication`                   | object | No       | Authentication configuration (defaults to "none")      |
+| `datasources[].cluster`                          | array  | No       | Array of cluster names associated with this datasource |

 ### URL vs ServiceName
</code_context>
<issue_to_address>
**issue:** Clarify singular `cluster` vs plural `clusters` field name in examples

The table documents `datasources[].cluster` while the later API response example uses `"clusters": ["default"]`. Please standardize on a single field name (`cluster` vs `clusters`) across the schema, examples, and response to avoid ambiguity about which is correct.
</issue_to_address>

Sourcery is free for open source - if you like our reviews please consider sharing them ✨
Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.

Comment thread src/main/java/com/autotune/common/bulk/BulkServiceValidation.java Outdated
Comment thread src/main/java/com/autotune/database/table/KruizeDataSourceEntry.java Outdated
Comment thread design/KruizeDatasource.md Outdated
khansaad added 2 commits June 18, 2026 20:37
Signed-off-by: Saad Khan <saakhan@ibm.com>
Signed-off-by: Saad Khan <saakhan@ibm.com>
@khansaad khansaad force-pushed the feature/bulk-api-cluster-settings-clean branch 2 times, most recently from 9a9b385 to 12982bf Compare June 19, 2026 12:17
Signed-off-by: Saad Khan <saakhan@ibm.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

enhancement New feature or request

Projects

Status: In Progress

Development

Successfully merging this pull request may close these issues.

1 participant