Strange Behaviour on RepartitionExec with CoalescePartitionsExec. #5278
Having a quick look at the plan: the repartition will partition into a single partition (because you only have a single unique key), which is likely not the first partition. The stream for the first partition will only advance when
I thought so. If you need assistance, I could provide it.
Does it make any difference if this is
I will take a look.
It does. I did not put the
I think the issue is your unbounded stream: it never yields back to tokio and hence the tasks that receive the data will never be executed (even though they are scheduled), see https://tokio.rs/blog/2020-04-preemption and https://docs.rs/tokio/latest/tokio/task/fn.yield_now.html. I'm pretty sure the repartition channels are correct. I've added a bunch of printlns and they wake the receivers, but the receiver is never executed by tokio.
This BTW will be an issue with ALL CPU-bound tasks in tokio/DataFusion. You MUST yield to tokio from time to time to keep the system stable.
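To make the failure mode concrete, here is a minimal sketch (not DataFusion code) of a producer that is always ready: without the periodic `tokio::task::yield_now().await`, the loop can monopolize its worker thread and other already-scheduled tasks may never be polled. The channel, the function names, and the yield interval of 64 are illustrative choices, not anything from the issue.

```rust
use std::time::Duration;
use tokio::sync::mpsc;

/// A CPU-bound producer that never awaits anything on its own.
/// The periodic `yield_now()` hands control back to the tokio scheduler
/// so other scheduled tasks (e.g. repartition receivers) can run.
async fn produce_forever(tx: mpsc::UnboundedSender<u64>) {
    let mut produced: u64 = 0;
    loop {
        if tx.send(produced).is_err() {
            return; // receiver dropped
        }
        produced += 1;
        if produced % 64 == 0 {
            // Cooperative yield point; without it this loop never yields.
            tokio::task::yield_now().await;
        }
    }
}

#[tokio::main]
async fn main() {
    let (tx, mut rx) = mpsc::unbounded_channel();
    let producer = tokio::spawn(produce_forever(tx));
    let consumer = tokio::spawn(async move {
        while let Some(_v) = rx.recv().await {
            // process the value ...
        }
    });

    // Let the pipeline run briefly, then stop both tasks.
    tokio::time::sleep(Duration::from_millis(100)).await;
    producer.abort();
    consumer.abort();
}
```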
~~Your test (implicitly) uses a single threaded tokio executor (so there is only a single thread). With only a single thread, as @crepererum mentions, you need to yield control back to the scheduler (which is what `tokio::task::yield_now()` does).~~
I take it back -- I see your example has a multithreaded executor.
I think we could make this issue less likely by inserting a yield point into the distributor channels. Let me draft a PR...
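For illustration, a minimal sketch of the general idea (not the actual `RepartitionExec` code, and using a plain tokio unbounded channel instead of DataFusion's distributor channels): yield back to the scheduler after forwarding each item, so receiver tasks get a chance to run even when the input stream is always ready and never returns `Poll::Pending`. The `distribute` function and its partition-0-only routing are hypothetical.

```rust
use futures::{Stream, StreamExt};
use tokio::sync::mpsc::UnboundedSender;

/// Hypothetical distribution loop: pull items from the input and forward
/// each one to the channel for its target partition. The yield after every
/// send gives the receiving tasks a chance to run even if `input` is
/// always ready and never returns `Poll::Pending`.
async fn distribute<T, S>(mut input: S, senders: Vec<UnboundedSender<T>>)
where
    S: Stream<Item = T> + Unpin,
{
    while let Some(item) = input.next().await {
        // Toy routing: send everything to partition 0; the real operator
        // hashes the partitioning keys to pick the output partition.
        let _ = senders[0].send(item);

        // The added yield point: hand control back to the tokio scheduler.
        tokio::task::yield_now().await;
    }
}
```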
This prevents endless spinning and locked up tokio tasks if the inputs never yield `pending`. Fixes apache#5278.
* fix: add yield point to `RepartitionExec`. This prevents endless spinning and locked up tokio tasks if the inputs never yield `pending`. Fixes #5278.
* refactor: use a single `UnboundedExec` for testing
* refactor: rename test
Describe the bug
I am working with a physical plan. Using the physical plan with `CoalescePartitionsExec` and `RepartitionExec` causes strange behavior when providing a stream with only one unique value.

The strange behavior:
The issue with the `CoalescePartitionsExec` and `RepartitionExec` physical plan is that when a stream with only one unique value is provided, no data is read from the `RepartitionExec` until the stream is exhausted. Even calling `wake_receivers()` does not wake up the `DistributionReceiver`. This behavior is not observed without `CoalescePartitionsExec`. With more unique values there is no blocking problem; it only occurs with a single unique value.
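For reference, a minimal sketch of the plan shape described above, assuming a finite `MemoryExec` input with a single repeated key rather than the unbounded stream from the report (so this particular example completes instead of hanging). The column name `a`, the partition count of 4, and the exact API paths (which vary a bit between DataFusion versions) are illustrative assumptions.

```rust
use std::sync::Arc;

use datafusion::arrow::array::{ArrayRef, Int32Array};
use datafusion::arrow::datatypes::{DataType, Field, Schema};
use datafusion::arrow::record_batch::RecordBatch;
use datafusion::error::Result;
use datafusion::physical_plan::coalesce_partitions::CoalescePartitionsExec;
use datafusion::physical_plan::expressions::col;
use datafusion::physical_plan::memory::MemoryExec;
use datafusion::physical_plan::repartition::RepartitionExec;
use datafusion::physical_plan::{ExecutionPlan, Partitioning};
use datafusion::prelude::SessionContext;
use futures::StreamExt;

#[tokio::main]
async fn main() -> Result<()> {
    // One column where every row carries the same value, mirroring the
    // "single unique key" situation from the report.
    let schema = Arc::new(Schema::new(vec![Field::new("a", DataType::Int32, false)]));
    let array: ArrayRef = Arc::new(Int32Array::from(vec![1; 1024]));
    let batch = RecordBatch::try_new(schema.clone(), vec![array])?;

    // Finite in-memory input (the report used an unbounded stream instead).
    let input = Arc::new(MemoryExec::try_new(&[vec![batch]], schema.clone(), None)?);

    // Hash-repartition on column "a"; with one unique key all rows land in
    // a single output partition, which need not be partition 0.
    let repartition = Arc::new(RepartitionExec::try_new(
        input,
        Partitioning::Hash(vec![col("a", &schema)?], 4),
    )?);

    // Merge all output partitions back into one stream.
    let plan: Arc<dyn ExecutionPlan> = Arc::new(CoalescePartitionsExec::new(repartition));

    let ctx = SessionContext::new();
    let mut stream = plan.execute(0, ctx.task_ctx())?;
    while let Some(batch) = stream.next().await {
        println!("got {} rows", batch?.num_rows());
    }
    Ok(())
}
```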
Plans with blocking repartition:
Plan without blocking (`plan.execute(2, task)`, and this can change according to hash value):

To Reproduce
`datafusion/core/tests/repartition_exec_blocks.rs`
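The essential ingredient of a reproduction is an input whose stream is always ready, i.e. whose `poll_next` never returns `Poll::Pending`. A minimal sketch of such a stream follows; `AlwaysReady` is a hypothetical stand-in, not the `UnboundedExec` used in the actual test.

```rust
use std::pin::Pin;
use std::task::{Context, Poll};

use futures::Stream;

/// An infinite stream that is always ready: `poll_next` never returns
/// `Poll::Pending`, so a task draining it never yields back to the tokio
/// scheduler on its own.
struct AlwaysReady {
    next: u64,
}

impl Stream for AlwaysReady {
    type Item = u64;

    fn poll_next(mut self: Pin<&mut Self>, _cx: &mut Context<'_>) -> Poll<Option<Self::Item>> {
        let value = self.next;
        self.next += 1;
        Poll::Ready(Some(value))
    }
}
```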
Expected behavior
The `DistributionReceiver` should be woken up, and data should be read, after the `wake_receivers()` call.

Additional context
cc @crepererum @alamb @tustvold