Default ESQL data partitioning to DOC #99545

dnhatn · 2023-09-13T16:13:33Z

DOC data partitioning typically outperforms SEGMENT data partitioning. Since we've capped the number of concurrent operators at the number of threads in the ESQL worker as a safeguard, we should consider changing the default data partitioning from SEGMENT to DOC.

Relates #99189
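
To make the difference between the two strategies concrete, here is a minimal sketch in Python. It is an illustrative model only, not ESQL's actual implementation: `Slice`, `partition_by_segment`, `partition_by_doc`, and the `max_workers` parameter are hypothetical names, and the sketch assumes DOC partitioning cuts each segment's doc-ID space into contiguous ranges sized to the number of workers.

```python
from dataclasses import dataclass
from math import ceil


@dataclass(frozen=True)
class Slice:
    segment: int  # index of the segment this slice reads from
    min_doc: int  # first doc ID in the slice (inclusive)
    max_doc: int  # end of the slice (exclusive)


def partition_by_segment(segment_sizes):
    """SEGMENT-style (assumed): one slice per segment, whatever its size."""
    return [Slice(i, 0, n) for i, n in enumerate(segment_sizes)]


def partition_by_doc(segment_sizes, max_workers):
    """DOC-style (assumed): slices of at most ceil(total / max_workers) docs."""
    total = sum(segment_sizes)
    target = max(1, ceil(total / max_workers))
    slices = []
    for i, n in enumerate(segment_sizes):
        # Cut this segment's doc-ID space into contiguous ranges.
        for start in range(0, n, target):
            slices.append(Slice(i, start, min(start + target, n)))
    return slices


# One large segment and one small one, with four hypothetical workers.
sizes = [1_000_000, 10_000]
by_segment = partition_by_segment(sizes)
by_doc = partition_by_doc(sizes, max_workers=4)

largest_segment_slice = max(s.max_doc - s.min_doc for s in by_segment)
largest_doc_slice = max(s.max_doc - s.min_doc for s in by_doc)
```

In this model, SEGMENT partitioning leaves a single worker with the whole large segment, while DOC partitioning spreads it over several slices of bounded size. That is the intuition behind the performance claim above; it says nothing about the cost of running a filter per slice.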

elasticsearchmachine · 2023-09-13T16:13:57Z

Pinging @elastic/es-ql (Team:QL)

elasticsearchmachine · 2023-09-13T16:13:58Z

Pinging @elastic/elasticsearch-esql (:Query Languages/ES|QL)

dnhatn · 2023-09-13T16:35:08Z

Thanks, Nik!

ChrisHegarty

LGTM

jpountz · 2023-09-13T19:33:42Z

I haven't run benchmarks to confirm, but in theory this change potentially improves latency while hurting throughput until we add support for doc partitioning to Lucene: apache/lucene#9721. This is because some queries, like range queries, have no option but to evaluate the filter across the entire segment anyway. If you split a segment into, say, 10 partitions and evaluate a range filter across 10 threads (one per partition), each thread will actually evaluate the filter against the entire segment, which is not what we want.

Given how common range queries on the @timestamp field are, my preference would be to wait until we address this Lucene issue before we enable doc partitioning by default in ES|QL.
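
The range-filter concern above can be sketched with a toy cost model in Python. This is a simplification under stated assumptions, not how Lucene evaluates range queries: `evaluate_range_filter` and `run_partitioned` are hypothetical helpers, and the linear scan merely stands in for whatever per-segment work the filter has to do. The partitions run sequentially here; the point is the total work, not the wall-clock time.

```python
def evaluate_range_filter(values, lo, hi):
    """Segment-wide evaluation: visits every doc in the segment.

    Returns the matching doc IDs and the number of docs visited.
    """
    matches = [doc for doc, v in enumerate(values) if lo <= v <= hi]
    return matches, len(values)


def run_partitioned(values, lo, hi, partitions):
    """Model one worker per doc partition, each re-running the whole filter."""
    size = -(-len(values) // partitions)  # ceiling division
    visited_total = 0
    results = []
    for p in range(partitions):
        start = p * size
        end = min((p + 1) * size, len(values))
        # Assumption from the comment above: the filter cannot be limited
        # to [start, end), so it is evaluated against the entire segment.
        matches, visited = evaluate_range_filter(values, lo, hi)
        visited_total += visited
        # Only the matches that fall inside this partition are kept.
        results.extend(doc for doc in matches if start <= doc < end)
    return results, visited_total


values = list(range(1000))  # stand-in for a segment's @timestamp values
one_matches, one_visited = run_partitioned(values, 100, 199, partitions=1)
ten_matches, ten_visited = run_partitioned(values, 100, 199, partitions=10)
```

Both runs return the same matches, but the 10-partition run visits ten times as many docs in total. With enough idle threads that extra work may not hurt latency, yet it is CPU time that concurrent queries can no longer use, which is the throughput cost described above.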

dnhatn · 2023-09-13T19:44:19Z

@jpountz Thank you for the feedback. I will take a look at the issue that you linked and leave this PR unmerged as you suggested.

costin

LGTM

elasticsearchmachine · 2024-01-02T19:53:05Z

Pinging @elastic/es-analytics-geo (Team:Analytics)

Default ESQL data partitioning to DOC

604f299

dnhatn added >non-issue :Analytics/ES|QL AKA ESQL v8.11.0 labels Sep 13, 2023

dnhatn requested review from costin, nik9000 and ChrisHegarty September 13, 2023 16:13

elasticsearchmachine added the Team:QL (Deprecated) Meta label for query languages team label Sep 13, 2023

nik9000 approved these changes Sep 13, 2023

dnhatn added the auto-merge-without-approval Automatically merge pull request when CI checks pass (NB doesn't wait for reviews!) label Sep 13, 2023

ChrisHegarty approved these changes Sep 13, 2023

dnhatn added 6 commits September 13, 2023 10:30

Fix casting in Column|String ExtractOperator

acb4054

Use segment in spec test

4f8d17b

Randomize pragmas

b05113e

Revert "Randomize pragmas"

b93c86c

This reverts commit b05113e.

sort

9388bd0

Merge remote-tracking branch 'elastic/main' into default-doc-partitio…

838c434

…ning

dnhatn removed the auto-merge-without-approval Automatically merge pull request when CI checks pass (NB doesn't wait for reviews!) label Sep 13, 2023

costin approved these changes Sep 14, 2023

mattc58 added v8.12.0 and removed v8.11.0 labels Oct 4, 2023

brianseeders added v8.13.0 and removed v8.12.0 labels Dec 6, 2023

wchaparro added the Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) label Jan 2, 2024

elasticsearchmachine removed the Team:QL (Deprecated) Meta label for query languages team label Jan 2, 2024

dnhatn closed this Jan 2, 2024

dnhatn removed >non-issue Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) :Analytics/ES|QL AKA ESQL v8.13.0 labels Jan 2, 2024

dnhatn deleted the default-doc-partitioning branch January 2, 2024 22:29
