Add a MergePolicy wrapper that preserves search concurrency? #12877
Comments
+1, I like this idea. It might be implemented by having Perhaps when the index is tiny it doesn't do the
I have a vague recollection of you saying you already implemented something like that, am I making this up? (it's quite possible, I struggle to keep lots of stuff in memory!)
One potential issue that comes to mind with this approach is that
+1
Ha! No, you are not hallucinating @jpountz! We do have something like this for Amazon product search -- it's crucial for our usage to keep long-pole query latencies low by maximizing concurrency -- but it might just be as stupid as "setMaxMergedSegmentMB" to something "just right" for our usage. I'll poke around inside and see if our impl is not too embarrassing to share ;)
Hmm you're right -- TMP would spend more time doing this "unbalanced packing", though, presumably it would sort of run out of options because it gobbles up the smallish segments aggressively, maybe? Hard to visualize...
OK well our (Amazon product search's) implementation is sorta messy: we subclass
Description
We have an issue about decoupling search concurrency from index geometry (#9721), but this comes with trade-offs as the per-segment bit of search is hard to parallelize. Maybe we should also introduce a merge policy wrapper that tries to preserve a search concurrency of `N` by preventing the creation of segments of more than `maxDoc/N` docs?
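
To make the proposal concrete, here is a minimal sketch of the size rule alone, in plain Java with no Lucene dependency. `ConcurrencyCap`, `maxDocsPerSegment` and `filterMerges` are hypothetical names chosen for illustration, and each candidate merge is modelled as an array of per-segment doc counts. A real wrapper would have to apply the same check to the merges proposed by the wrapped policy, which is not shown here.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper (not a Lucene class): the maxDoc/N rule in isolation. */
final class ConcurrencyCap {
  private ConcurrencyCap() {}

  /** Doc-count cap per segment so that roughly n segments can be searched in parallel. */
  static long maxDocsPerSegment(long maxDoc, int n) {
    if (n <= 0) {
      throw new IllegalArgumentException("n must be positive");
    }
    // Never return 0, otherwise a tiny index could not merge anything at all.
    return Math.max(1, maxDoc / n);
  }

  /** Keeps only the candidate merges whose combined doc count stays within the cap. */
  static List<long[]> filterMerges(List<long[]> candidates, long maxDoc, int n) {
    long cap = maxDocsPerSegment(maxDoc, n);
    List<long[]> kept = new ArrayList<>();
    for (long[] merge : candidates) {
      long total = 0;
      for (long docs : merge) {
        total += docs; // doc count of the segment this merge would produce
      }
      if (total <= cap) {
        kept.add(merge);
      }
    }
    return kept;
  }
}
```

With `maxDoc = 1000` and `N = 4` the cap is 250 docs, so a candidate merging segments of 100 and 100 docs is kept while one merging 200 and 100 docs is dropped. On a very small index the cap becomes tiny too, so a real implementation would probably want a floor below which the rule is not applied; that floor is an assumption, not something specified in this issue.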