
Stop duplicating per-segment work across segment partitions #13745

Open
javanna opened this issue Sep 9, 2024 · 3 comments
javanna commented Sep 9, 2024

With the introduction of intra-segment concurrency support (see #13542), the IndexSearcher is able to parallelize search across partitions of segments. Some of our queries, like PointRangeQuery, perform per-segment computation ahead of time that is currently not shared across segment partitions, but rather duplicated across them.

This makes intra-segment concurrency unappealing for those queries, which we should address. If we can share such computation across segment partitions, we can likely enable intra-segment slicing by default and ensure that intra-segment concurrency is beneficial for all queries.

javanna added a commit that referenced this issue Sep 10, 2024
This commit introduces support for optionally creating slices that target leaf reader context partitions, which allows them to be searched concurrently. This helps maximize resource usage when searching force-merged indices, or indices with rather big segments, by parallelizing search execution across subsets of the segments being searched.
    
Note: this commit does not affect the default generation of slices. Segments can be partitioned by overriding the `IndexSearcher#slices(List<LeafReaderContext>)` method to plug in ad-hoc slice creation. Moreover, the existing `IndexSearcher#slices` static method now creates segment partitions when the additional `allowSegmentsPartitions` argument is set to `true`.
  
The overall design of this change builds on the existing search concurrency support based on `LeafSlice` and `CollectorManager`. A new `LeafReaderContextPartition` abstraction is introduced that holds a reference to a `LeafReaderContext` and the range of doc ids it targets. A `LeafSlice` now targets segment partitions, each identified by a `LeafReaderContext` instance and a range of doc ids. It is possible for a partition to target a whole segment, and for partitions of different segments to be combined into the same leaf slice freely, hence searched by the same thread. It is not possible for multiple partitions of the same segment to be added to the same leaf slice.
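The slice/partition invariants described above can be sketched with a tiny self-contained model (hypothetical names, not the actual Lucene classes): a partition pairs a segment with a doc id range, partitions of different segments may share a slice, and two partitions of the same segment may not.

```java
// Simplified, self-contained model of the abstractions described above —
// not the actual Lucene classes. A partition pairs a segment (here just an
// ord) with a doc id range; a slice rejects two partitions of the same segment.
import java.util.ArrayList;
import java.util.List;

public final class PartitionModel {
    record LeafPartition(int leafOrd, int minDocId, int maxDocId) {}

    static final class LeafSliceModel {
        private final List<LeafPartition> partitions = new ArrayList<>();

        void add(LeafPartition p) {
            for (LeafPartition existing : partitions) {
                if (existing.leafOrd() == p.leafOrd()) {
                    throw new IllegalArgumentException(
                        "two partitions of the same segment may not share a slice");
                }
            }
            partitions.add(p);
        }

        int size() { return partitions.size(); }
    }

    public static void main(String[] args) {
        LeafSliceModel slice = new LeafSliceModel();
        // Partitions of *different* segments may be combined freely:
        slice.add(new LeafPartition(0, 0, 50_000));
        slice.add(new LeafPartition(1, 0, Integer.MAX_VALUE)); // whole segment
        boolean rejected = false;
        try {
            // A second partition of segment 0 must be rejected:
            slice.add(new LeafPartition(0, 50_000, 100_000));
        } catch (IllegalArgumentException e) {
            rejected = true;
        }
        System.out.println(slice.size() + " " + rejected);
    }
}
```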
    
Segment partitions are searched concurrently by leveraging the existing `BulkScorer#score(LeafCollector collector, Bits acceptDocs, int min, int max)` method, which allows scoring a specific range of documents into a provided `LeafCollector`, in place of `BulkScorer#score(LeafCollector collector, Bits acceptDocs)`, which would instead score all documents.
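The range-limited scoring idea can be illustrated with a small stand-alone sketch (a stand-in for the real `BulkScorer`, not Lucene code): each partition collects only the matches inside its `[min, max)` doc id window, so two partitions can cover one segment's doc id space without overlap.

```java
// Toy illustration of range-limited bulk scoring (not the Lucene API):
// each partition scores only its [min, max) doc id window, so two threads
// can cover one segment's doc id space without overlap.
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;

public final class RangeScoringDemo {
    // Stand-in for BulkScorer#score(LeafCollector, Bits, int, int):
    // collect every matching doc id in [min, max).
    static void scoreRange(BitSet matches, int min, int max, List<Integer> collector) {
        for (int doc = matches.nextSetBit(min); doc >= 0 && doc < max;
                doc = matches.nextSetBit(doc + 1)) {
            collector.add(doc);
        }
    }

    public static void main(String[] args) {
        int maxDoc = 10;
        BitSet matches = new BitSet(maxDoc);
        matches.set(1); matches.set(4); matches.set(7); matches.set(9);

        List<Integer> first = new ArrayList<>();
        List<Integer> second = new ArrayList<>();
        scoreRange(matches, 0, 5, first);       // partition 1: docs [0, 5)
        scoreRange(matches, 5, maxDoc, second); // partition 2: docs [5, 10)

        System.out.println(first + " " + second);
    }
}
```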

## Changes that require migration

The migration guide has the following new clarifying items around the contract and the breaking changes required to support intra-segment concurrency:
- `Collector#getLeafCollector` may be called multiple times for the same leaf across distinct `Collector` instances created by a `CollectorManager`. Logic that relies on `getLeafCollector` being called once per leaf per search needs updating.
- A `Scorer`, `ScorerSupplier` or `BulkScorer` may be requested multiple times for the same leaf.
- `IndexSearcher#searchLeaf` changed its signature to accept the range of doc ids to search.
- `BulkScorer#score(LeafCollector, Bits)` is removed in favour of `BulkScorer#score(LeafCollector, Bits, int, int)`.
- The static `IndexSearcher#slices` method now takes an additional last boolean argument that optionally enables the creation of segment partitions.
- `TotalHitCountCollectorManager` now requires that an array of `LeafSlice`s, retrieved via `IndexSearcher#getSlices`, is provided to its constructor.
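The first item in the list above changes a long-standing assumption: with segment partitions, a leaf is visited once *per partition* rather than once per search. A toy model of the effect (hypothetical, not Lucene code):

```java
// Toy model of the contract change: with intra-segment partitions, a leaf
// is handed to getLeafCollector once *per partition* (via distinct
// Collector instances from one CollectorManager), so any "once per leaf"
// assumption breaks.
import java.util.HashMap;
import java.util.Map;

public final class LeafVisitDemo {
    public static void main(String[] args) {
        // Two partitions of segment 0 plus the whole of segment 1.
        int[] partitionLeaves = {0, 0, 1};
        Map<Integer, Integer> visitsPerLeaf = new HashMap<>();
        for (int leaf : partitionLeaves) {
            // Each partition triggers its own getLeafCollector call.
            visitsPerLeaf.merge(leaf, 1, Integer::sum);
        }
        // Segment 0 is visited twice, segment 1 once.
        System.out.println(visitsPerLeaf);
    }
}
```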
 
    
Note: `DrillSideways` is the only component that does not support intra-segment concurrency and needs considerable work to do so, due to its requirement that the entire set of docs in a segment gets scored in one go.
    
The default searcher slicing is not affected by this PR, but `LuceneTestCase` now randomly leverages intra-segment concurrency. An additional `newSearcher` method is added that takes a `Concurrency` enum as the last argument in place of the `useThreads` boolean flag. This is important to disable intra-segment concurrency for `DrillSideways` related tests that do support inter-segment concurrency but not intra-segment concurrency.

## Next step

While this change introduces support for intra-segment concurrency, it only sets up the foundations of it. There is still a performance penalty for queries that require segment-level computation ahead of time, such as points/range queries. This is an implementation limitation that we expect to improve in future releases, see #13745.

Additionally, we will need to decide what to do about the lack of support for intra-segment concurrency in `DrillSideways` before we can enable intra-segment slicing by default. See #13753.

Closes #9721
javanna changed the title from "Stop duplicate per-segment work across segment partitions" to "Stop duplicating per-segment work across segment partitions" on Sep 12, 2024

javanna commented Feb 25, 2025

This is a heads up that I started working on this. My focus is currently on PointRangeQuery. The overall goal is to share the bitset computation across scorer suppliers for the same leaf context. We would still create one scorer supplier per segment partition, and a new doc id set iterator, but all the rest would be shared across different partitions of the same segment.
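The sharing idea described here can be sketched with a small self-contained memoization model (hypothetical names, not the actual Lucene change): the expensive per-segment computation runs once per leaf and is cached, while each partition still gets its own supplier and iterator on top of the shared result.

```java
// Sketch of sharing per-segment work across partitions (hypothetical,
// not the actual Lucene implementation): the heavy computation — e.g. a
// PointRangeQuery-style bitset of matching docs — is memoized per leaf,
// while each partition wraps the shared result in its own iterator.
import java.util.BitSet;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

public final class SharedPerLeafWorkDemo {
    static final AtomicInteger expensiveRuns = new AtomicInteger();

    // Stand-in for the expensive per-segment computation.
    static BitSet computeMatchingDocs(int leafOrd) {
        expensiveRuns.incrementAndGet(); // count how often the heavy work runs
        BitSet docs = new BitSet();
        docs.set(leafOrd); // dummy content
        return docs;
    }

    public static void main(String[] args) {
        Map<Integer, BitSet> perLeafCache = new ConcurrentHashMap<>();
        // Three partitions: two of leaf 0, one of leaf 1. Each partition's
        // "scorer supplier" reuses the memoized bitset for its leaf.
        int[] partitionLeaves = {0, 0, 1};
        for (int leaf : partitionLeaves) {
            BitSet shared = perLeafCache.computeIfAbsent(
                leaf, SharedPerLeafWorkDemo::computeMatchingDocs);
            // Each partition would build its own doc id iterator over
            // `shared` here; the bitset itself is computed only once.
            shared.cardinality();
        }
        System.out.println(expensiveRuns.get()); // heavy work ran once per leaf
    }
}
```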

Focusing on a single query is quite a simplification, which allows me to better understand the problem I am trying to solve. I do wonder how many other queries need this kind of treatment. I know there have been talks in the past about cloning scorers and such, but I wonder if that can be avoided by adapting the queries / weight impls that require attention instead.

Could anyone help me come up with a list of queries that need to be modified to de-duplicate per-segment work in order to optimize their support for intra-segment concurrency? Do we actually need a very generic approach for this problem, or is it reasonable to go case by case perhaps?


msokolov commented Mar 4, 2025

HNSW vector search heavy lifting is done in rewrite, so out of scope for this, right? Maybe multi-term queries would need to do some work. What about join queries? TermInSet query?


javanna commented Mar 4, 2025

> HNSW vector search heavy lifting is done in rewrite, so out of scope for this, right?

I believe so, mostly because query rewrite does not parallelize on slices, but across segments directly. See #13952 .

> Maybe multi-term queries would need to do some work. What about join queries? TermInSet query?

Yep, TermInSet was on my radar. The others you mentioned, I will take a look at. Thanks!
