add adaptive batch size heuristic for filtered search #309

yuejiaointel wants to merge 13 commits into main
Conversation
rfsaliev
left a comment
Thank you for the good proposal.

Requested changes:
Please apply such improvements to `range_search()` and in `vamana_index_impl.h` as well.
Suggestions:
There are some performance related suggestions in comments.
But during the review I found that the `compute_filtered_batch_size()` logic is a prediction of the further amount of processing, based on previous processing results and the requested number of matches, i.e.:
`PredictFurtherProcessing(processed, hits, goal)`
So I would declare this function in a more generic way, move it to a utilities header with a more common signature, and reuse it in `vamana_index_impl.h` as well.
In such case, the `% max_batch_size` operation should be applied outside of this function:
```cpp
/// @param processed - number of already processed elements (total_checked)
/// @param hits - number of matched elements (found)
/// @param goal - number of requested elements to be matched (needed)
/// @param hint - result to be returned if prediction fails, e.g. other params == 0
size_t predict_further_processing(size_t processed, size_t hits, size_t goal, size_t hint) {
    if (processed * hits * goal == 0) {
        return hint;
    }
    // use prediction formula below
    ...
}
```

```diff
@@ -136,6 +153,8 @@ class DynamicVamanaIndexImpl {
                 }
             }
         }
+        batch_size =
+            compute_filtered_batch_size(found, k, total_checked, batch_size);
```
Good idea, but, from a performance perspective, I would slightly change the code:

- Compute the batch size at the beginning of the do-while loop - it will avoid the computation when `found == k`
- Increment `total_checked` outside of the for loop.
- It might make sense to set the initial batch size to the max of `k` and `search_window_size`

E.g.:
```cpp
size_t total_checked = 0;
auto batch_size = std::max(k, sp.buffer_config_.get_search_window_size());
do {
    batch_size =
        compute_filtered_batch_size(found, k, total_checked, batch_size);
    iterator.next(batch_size);
    for (auto& neighbor : iterator.results()) {
        if (filter->is_member(neighbor.id())) {
            result.set(neighbor, i, found);
            found++;
            if (found == k) {
                break;
            }
        }
    }
    total_checked += iterator.size();
```
Thx, added these changes.
```cpp
double hit_rate = static_cast<double>(found) / total_checked;
return static_cast<size_t>((needed - found) / hit_rate);
```
I would also try to improve performance here:

- FP64 computation is not very performant
- Computation precision is not very important here
- There are potential issues in the SVS BatchIterator in case of a huge batch size

So, I would use the following formula:

- `hit_rate_inv = 1 / hit_rate = checked / found`
- `result = (needed - found) / hit_rate = (needed - found) * hit_rate_inv = needed * checked / found - checked`
- The formula `needed * checked / found - checked` is the most precise, but there is a bigger risk of overflow for huge `needed` and `checked` values
```diff
-double hit_rate = static_cast<double>(found) / total_checked;
-return static_cast<size_t>((needed - found) / hit_rate);
+auto hit_rate = total_checked / found + 1; // found == 0 is handled above; +1 to increase the result, eliminating INT precision issues
+return (needed - found) * hit_rate % max_batch_size; // max_batch_size - constant
```
Alternative (assuming that FP32 is fast enough):
```diff
-double hit_rate = static_cast<double>(found) / total_checked;
-return static_cast<size_t>((needed - found) / hit_rate);
+float new_batch_size = static_cast<float>(needed) * total_checked / found - total_checked;
+return static_cast<size_t>(new_batch_size) % max_batch_size;
```
thx, added; we probably need to run some benchmarks before knowing the exact performance
- Rename compute_filtered_batch_size to predict_further_processing and move to svs_runtime_utils.h for reuse
- Use float arithmetic instead of double for the hit rate calculation
- Compute batch size at loop start to avoid unnecessary computation
- Use iterator.size() instead of a per-element increment for total_checked
- Initial batch size = max(k, search_window_size)
- Apply adaptive batch size to vamana_index_impl.h filtered search
- Cap batch size with std::min instead of modulo to avoid SIGFPE
- Add comments explaining the adaptive batch sizing logic
769bcf5 to ee06f00
rfsaliev
left a comment
It seems like a max_batch_size calculation issue.
bindings/cpp/src/vamana_index_impl.h
Outdated
```cpp
// Use adaptive batch sizing: start with at least k candidates,
// then adjust based on observed filter hit rate.
auto batch_size = std::max(k, sp.buffer_config_.get_search_window_size());
const auto max_batch_size = batch_size;
```
IMHO, the max_batch_size value should be a (compile-time?) constant based on generic SVS Vamana performance instead of the current k or search_window_size.

For example:

- `k == search_window_size == 10`
- `filter->is_member()` returns `true` for 10% of results
- after the first iteration, the predict function will return `(10 - 1) * 10 / 1 == 90`
- but the next batch_size will be limited to `max_batch_size == 10`
- So, we will have 10 small iterations instead of 1 big enough one
thx for the suggestion, and agreed, the cap is too restrictive. Removed the cap entirely and added a filter_stop early-exit heuristic instead:
if the hit rate falls below a threshold (set by the user) after getting some hits, we give up and return empty results; the iterator should be able to handle a large batch size by growing the search buffer
there are some discussions about this in the benchmark-results discussion; pulled you into the chat
- Remove the max_batch_size cap that limited adaptive sizing effectiveness
- Add a filter_stop param to SearchParams (default 0 = never give up)
- Add a should_stop_filtered_search() helper in svs_runtime_utils.h
- If the hit rate falls below filter_stop after the first round, return empty so the caller can fall back to exact search
Verifies that search with filter_stop=0.5 gives up and returns unspecified results when hit rate (~10%) is below threshold.
```diff
-iterator.next(k);
+batch_size =
+    predict_further_processing(total_checked, found, k, batch_size);
+iterator.next(batch_size);
```
What will happen on the second iteration in case:

- filter_stop = 0.0
- batch_size = k = 100
- found = 1

How big will batch_size be here?
batch size will be (100 - 1) * 100 / 1 = 9900; with a 1% hit rate, to find the remaining 99 results we need 9900 more. Is that too large?
@ibhati , can you please clarify if batch_size=9900 is suitable for Vamana BatchIterator?
Thank you.
checked with Ishwar and was told the max size should be up to the number of vectors in the index; changed, thx for this question!
Enables early exit by default so OpenSearch can test the heuristic without plumbing a new search parameter through the stack.
Batch size can never exceed the index size since there are no more vectors to check beyond that.
Add max_batch_size parameter instead of capping at each call site.
Keep early exit opt-in only. OpenSearch can set filter_stop=0.01 when ready to test the heuristic.
```cpp
// Selective search with IDSelector
auto old_sp = impl_->get_search_parameters();
impl_->set_search_parameters(sp);
const float filter_stop = params ? params->filter_stop : 0.0f;
```
It seems like this construction forces the user to always provide a proper filter_stop value whenever they want to configure SearchParams.
To eliminate this issue, I would recommend initializing the filter_stop field in SearchParams with `Unspecify<float>` and using `set_if_specified()` here:
```diff
-const float filter_stop = params ? params->filter_stop : 0.0f;
+float filter_stop = svs_default_filter_stop_defined_somewhere;
+if (params) {
+    set_if_specified(filter_stop, params->filter_stop);
+}
```
```cpp
// If the hit rate after the first round falls below this threshold,
// stop and return empty results (caller can fall back to exact search).
// Default 0 means never give up.
float filter_stop = 0.0f;
```
```diff
-float filter_stop = 0.0f;
+float filter_stop = Unspecify<float>();
```
```cpp
size_t found = 0;
size_t total_checked = 0;
auto batch_size = std::max(k, sp.buffer_config_.get_search_window_size());
const auto max_batch_size = impl_->size();
```
This can be moved out of the `search_closure`.
bindings/cpp/src/svs_runtime_utils.h
Outdated
```cpp
// If no hits yet, returns `hint` unchanged.
// Result is capped at `max_batch_size` (e.g., number of vectors in the index).
inline size_t predict_further_processing(
    size_t processed, size_t hits, size_t goal, size_t hint, size_t max_batch_size
```
```diff
-    size_t processed, size_t hits, size_t goal, size_t hint, size_t max_batch_size
+    size_t processed, size_t hits, size_t goal, size_t hint, size_t max_value
```
thx, implemented these changes
…ax_value

- Use Unspecify<float>() for the filter_stop default, with the set_if_specified pattern
- Move max_batch_size (impl size) out of the search_closure
- Rename max_batch_size to max_value in predict_further_processing
Currently the filtered k-NN search loop uses batch_size = k when calling iterator.next(). When the filter is restrictive (e.g., 1% of IDs pass), this results in many expensive graph-traversal rounds to collect enough valid results.
This PR introduces a heuristic that adapts the batch size based on the observed filter hit rate.
For example, with k=10 and a 10% filter pass rate: instead of ~10 rounds of 10 candidates each, it converges in ~2 rounds.