-
Notifications
You must be signed in to change notification settings - Fork 1.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Minor: fix some new warnings, fix force_hash_collisions
flag propagation + comment out some tests
#11467
Conversation
Signed-off-by: Nick Cameron <nrc@ncameron.org>
So, this has got more interesting than I predicted. Since |
@@ -1930,6 +1931,8 @@ mod tests { | |||
Ok(()) | |||
} | |||
|
|||
// FIXME(#TODO) test fails with feature `force_hash_collisions` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you @nrc -- I agree that cfg'ing these out is the right step for this PR
I'll file a ticket to look into the failures.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was writing up a ticket and testing locally. When I removed the cfgs
diff --git a/datafusion/physical-plan/src/joins/hash_join.rs b/datafusion/physical-plan/src/joins/hash_join.rs
index fdf062b1f..359a86856 100644
--- a/datafusion/physical-plan/src/joins/hash_join.rs
+++ b/datafusion/physical-plan/src/joins/hash_join.rs
@@ -1583,7 +1583,7 @@ mod tests {
use rstest::*;
use rstest_reuse::*;
- #[cfg(not(feature = "force_hash_collisions"))]
+
fn div_ceil(a: usize, b: usize) -> usize {
(a + b - 1) / b
}
@@ -1932,7 +1932,6 @@ mod tests {
}
// FIXME(#TODO) test fails with feature `force_hash_collisions`
- #[cfg(not(feature = "force_hash_collisions"))]
#[apply(batch_sizes)]
#[tokio::test]
async fn join_inner_two(batch_size: usize) -> Result<()> {
@@ -1989,7 +1988,6 @@ mod tests {
/// Test where the left has 2 parts, the right with 1 part => 1 part
// FIXME(#TODO) test fails with feature `force_hash_collisions`
- #[cfg(not(feature = "force_hash_collisions"))]
#[apply(batch_sizes)]
#[tokio::test]
async fn join_inner_one_two_parts_left(batch_size: usize) -> Result<()> {
@@ -2103,7 +2101,6 @@ mod tests {
/// Test where the left has 1 part, the right has 2 parts => 2 parts
// FIXME(#TODO) test fails with feature `force_hash_collisions`
- #[cfg(not(feature = "force_hash_collisions"))]
#[apply(batch_sizes)]
#[tokio::test]
async fn join_inner_one_two_parts_right(batch_size: usize) -> Result<()> {
And then ran the tests they seemed to pass for me locally:
andrewlamb@Andrews-MacBook-Pro-2:~/Software/datafusion$ cargo test --lib --features=force_hash_collisions -p datafusion-physical-plan -- join_inner
warning: /Users/andrewlamb/Software/datafusion/Cargo.toml: unused manifest key: workspace.lints.rust.unexpected_cfgs.check-cfg
Compiling datafusion-physical-plan v40.0.0 (/Users/andrewlamb/Software/datafusion/datafusion/physical-plan)
Finished `test` profile [unoptimized + debuginfo] target(s) in 3.59s
Running unittests src/lib.rs (target/debug/deps/datafusion_physical_plan-8b4d4245d6c07ebc)
running 40 tests
test joins::hash_join::tests::join_inner_one_no_shared_column_names ... ok
test joins::hash_join::tests::join_inner_one::batch_size_3_5 ... ok
test joins::hash_join::tests::join_inner_one_two_parts_left::batch_size_4_2 ... ok
test joins::hash_join::tests::join_inner_one::batch_size_2_10 ... ok
test joins::hash_join::tests::join_inner_one::batch_size_5_1 ... ok
test joins::hash_join::tests::join_inner_one::batch_size_1_8192 ... ok
test joins::hash_join::tests::join_inner_one_two_parts_left::batch_size_5_1 ... ok
test joins::hash_join::tests::join_inner_one_two_parts_left::batch_size_2_10 ... ok
test joins::hash_join::tests::join_inner_one_two_parts_left::batch_size_3_5 ... ok
test joins::hash_join::tests::join_inner_one::batch_size_4_2 ... ok
test joins::hash_join::tests::join_inner_one_two_parts_left_randomly_ordered ... ok
test joins::hash_join::tests::join_inner_one_two_parts_left::batch_size_1_8192 ... ok
test joins::hash_join::tests::join_inner_one_randomly_ordered ... ok
test joins::hash_join::tests::join_inner_one_two_parts_right::batch_size_2_10 ... ok
test joins::hash_join::tests::join_inner_one_two_parts_right::batch_size_1_8192 ... ok
test joins::hash_join::tests::join_inner_one_two_parts_right::batch_size_3_5 ... ok
test joins::hash_join::tests::join_inner_two::batch_size_1_8192 ... ok
test joins::hash_join::tests::join_inner_two::batch_size_4_2 ... ok
test joins::hash_join::tests::join_inner_two::batch_size_2_10 ... ok
test joins::hash_join::tests::join_inner_two::batch_size_3_5 ... ok
test joins::hash_join::tests::join_inner_two::batch_size_5_1 ... ok
test joins::hash_join::tests::join_inner_one_two_parts_right::batch_size_4_2 ... ok
test joins::hash_join::tests::join_inner_one_two_parts_right::batch_size_5_1 ... ok
test joins::hash_join::tests::join_inner_with_filter::batch_size_3_5 ... ok
test joins::hash_join::tests::join_inner_with_filter::batch_size_1_8192 ... ok
test joins::hash_join::tests::join_inner_with_filter::batch_size_2_10 ... ok
test joins::hash_join::tests::join_inner_with_filter::batch_size_5_1 ... ok
test joins::hash_join::tests::join_inner_with_filter::batch_size_4_2 ... ok
test joins::sort_merge_join::tests::join_inner_two ... ok
test joins::sort_merge_join::tests::join_inner_one ... ok
test joins::sort_merge_join::tests::join_inner_two_two ... ok
test joins::sort_merge_join::tests::join_inner_output_two_batches ... ok
test joins::hash_join::tests::partitioned_join_inner_one::batch_size_2_10 ... ok
test joins::sort_merge_join::tests::join_inner_with_nulls_with_options ... ok
test joins::sort_merge_join::tests::join_inner_with_nulls ... ok
test joins::hash_join::tests::partitioned_join_inner_one::batch_size_5_1 ... ok
test joins::hash_join::tests::partitioned_join_inner_one::batch_size_4_2 ... ok
test joins::hash_join::tests::partitioned_join_inner_one::batch_size_1_8192 ... ok
test joins::hash_join::tests::partitioned_join_inner_one::batch_size_3_5 ... ok
test joins::nested_loop_join::tests::join_inner_with_filter ... ok
test result: ok. 40 passed; 0 failed; 0 ignored; 0 measured; 688 filtered out; finished in 0.03s
@@ -238,16 +238,17 @@ SELECT * FROM t1 FULL JOIN t2 ON t1_id = t2_id | |||
44 d 4 44 x 3 | |||
NULL NULL NULL 55 w 3 | |||
|
|||
# FIXME(#TODO) fails with feature `force_hash_collisions` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what is the issue with SMJ with hash collisions?
I plan to investigate / file tickets for why the sqllogictests are failing before merging this PR -- I hope to do so today |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks again @nrc -- I did some local testing and I couldn't get the tests to fail for me locally. Are you sure they are failing if when run with force_hash_collisions
? Maybe it was something about my machine and we should try it on CI 🤔
@@ -1930,6 +1931,8 @@ mod tests { | |||
Ok(()) | |||
} | |||
|
|||
// FIXME(#TODO) test fails with feature `force_hash_collisions` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was writing up a ticket and testing locally. When I removed the cfgs
diff --git a/datafusion/physical-plan/src/joins/hash_join.rs b/datafusion/physical-plan/src/joins/hash_join.rs
index fdf062b1f..359a86856 100644
--- a/datafusion/physical-plan/src/joins/hash_join.rs
+++ b/datafusion/physical-plan/src/joins/hash_join.rs
@@ -1583,7 +1583,7 @@ mod tests {
use rstest::*;
use rstest_reuse::*;
- #[cfg(not(feature = "force_hash_collisions"))]
+
fn div_ceil(a: usize, b: usize) -> usize {
(a + b - 1) / b
}
@@ -1932,7 +1932,6 @@ mod tests {
}
// FIXME(#TODO) test fails with feature `force_hash_collisions`
- #[cfg(not(feature = "force_hash_collisions"))]
#[apply(batch_sizes)]
#[tokio::test]
async fn join_inner_two(batch_size: usize) -> Result<()> {
@@ -1989,7 +1988,6 @@ mod tests {
/// Test where the left has 2 parts, the right with 1 part => 1 part
// FIXME(#TODO) test fails with feature `force_hash_collisions`
- #[cfg(not(feature = "force_hash_collisions"))]
#[apply(batch_sizes)]
#[tokio::test]
async fn join_inner_one_two_parts_left(batch_size: usize) -> Result<()> {
@@ -2103,7 +2101,6 @@ mod tests {
/// Test where the left has 1 part, the right has 2 parts => 2 parts
// FIXME(#TODO) test fails with feature `force_hash_collisions`
- #[cfg(not(feature = "force_hash_collisions"))]
#[apply(batch_sizes)]
#[tokio::test]
async fn join_inner_one_two_parts_right(batch_size: usize) -> Result<()> {
And then ran the tests they seemed to pass for me locally:
andrewlamb@Andrews-MacBook-Pro-2:~/Software/datafusion$ cargo test --lib --features=force_hash_collisions -p datafusion-physical-plan -- join_inner
warning: /Users/andrewlamb/Software/datafusion/Cargo.toml: unused manifest key: workspace.lints.rust.unexpected_cfgs.check-cfg
Compiling datafusion-physical-plan v40.0.0 (/Users/andrewlamb/Software/datafusion/datafusion/physical-plan)
Finished `test` profile [unoptimized + debuginfo] target(s) in 3.59s
Running unittests src/lib.rs (target/debug/deps/datafusion_physical_plan-8b4d4245d6c07ebc)
running 40 tests
test joins::hash_join::tests::join_inner_one_no_shared_column_names ... ok
test joins::hash_join::tests::join_inner_one::batch_size_3_5 ... ok
test joins::hash_join::tests::join_inner_one_two_parts_left::batch_size_4_2 ... ok
test joins::hash_join::tests::join_inner_one::batch_size_2_10 ... ok
test joins::hash_join::tests::join_inner_one::batch_size_5_1 ... ok
test joins::hash_join::tests::join_inner_one::batch_size_1_8192 ... ok
test joins::hash_join::tests::join_inner_one_two_parts_left::batch_size_5_1 ... ok
test joins::hash_join::tests::join_inner_one_two_parts_left::batch_size_2_10 ... ok
test joins::hash_join::tests::join_inner_one_two_parts_left::batch_size_3_5 ... ok
test joins::hash_join::tests::join_inner_one::batch_size_4_2 ... ok
test joins::hash_join::tests::join_inner_one_two_parts_left_randomly_ordered ... ok
test joins::hash_join::tests::join_inner_one_two_parts_left::batch_size_1_8192 ... ok
test joins::hash_join::tests::join_inner_one_randomly_ordered ... ok
test joins::hash_join::tests::join_inner_one_two_parts_right::batch_size_2_10 ... ok
test joins::hash_join::tests::join_inner_one_two_parts_right::batch_size_1_8192 ... ok
test joins::hash_join::tests::join_inner_one_two_parts_right::batch_size_3_5 ... ok
test joins::hash_join::tests::join_inner_two::batch_size_1_8192 ... ok
test joins::hash_join::tests::join_inner_two::batch_size_4_2 ... ok
test joins::hash_join::tests::join_inner_two::batch_size_2_10 ... ok
test joins::hash_join::tests::join_inner_two::batch_size_3_5 ... ok
test joins::hash_join::tests::join_inner_two::batch_size_5_1 ... ok
test joins::hash_join::tests::join_inner_one_two_parts_right::batch_size_4_2 ... ok
test joins::hash_join::tests::join_inner_one_two_parts_right::batch_size_5_1 ... ok
test joins::hash_join::tests::join_inner_with_filter::batch_size_3_5 ... ok
test joins::hash_join::tests::join_inner_with_filter::batch_size_1_8192 ... ok
test joins::hash_join::tests::join_inner_with_filter::batch_size_2_10 ... ok
test joins::hash_join::tests::join_inner_with_filter::batch_size_5_1 ... ok
test joins::hash_join::tests::join_inner_with_filter::batch_size_4_2 ... ok
test joins::sort_merge_join::tests::join_inner_two ... ok
test joins::sort_merge_join::tests::join_inner_one ... ok
test joins::sort_merge_join::tests::join_inner_two_two ... ok
test joins::sort_merge_join::tests::join_inner_output_two_batches ... ok
test joins::hash_join::tests::partitioned_join_inner_one::batch_size_2_10 ... ok
test joins::sort_merge_join::tests::join_inner_with_nulls_with_options ... ok
test joins::sort_merge_join::tests::join_inner_with_nulls ... ok
test joins::hash_join::tests::partitioned_join_inner_one::batch_size_5_1 ... ok
test joins::hash_join::tests::partitioned_join_inner_one::batch_size_4_2 ... ok
test joins::hash_join::tests::partitioned_join_inner_one::batch_size_1_8192 ... ok
test joins::hash_join::tests::partitioned_join_inner_one::batch_size_3_5 ... ok
test joins::nested_loop_join::tests::join_inner_with_filter ... ok
test result: ok. 40 passed; 0 failed; 0 ignored; 0 measured; 688 filtered out; finished in 0.03s
force_hash_collisions
flag propagation + comment out some tests
@alamb thanks for reviewing and looking into this! This has got weirder and I may be doing something fundamentally wrong, but here's my investigation. For reference, I'm running on Linux AMD64, plenty of RAM and disk space. All these results are after only removing the Running I have no idea what is going on, but unless you have an idea, I can try and find the cause next week. |
I think there are two possibilities:
Given the tests pass for some machines I am inclined to believe it is the first, but I don't really know for sure without being able to see what the test failures are (can you provide some the output of the failed tests?) |
It appears we ran out of time on this one and the new clippy lint was released on stable: #11651 |
@@ -152,4 +152,5 @@ rpath = false | |||
large_futures = "warn" | |||
|
|||
[workspace.lints.rust] | |||
unexpected_cfgs = { level = "warn", check-cfg = ["cfg(tarpaulin)"] } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is this related to force_hash_collisions change, or something separate?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it allows tarpaulin
to be used as a feature
even if the crate doesn't define it
I rebased this commit (conflicts) and included it with other fixes in #11654. |
Fixes some warnings when compiling with nightly which will soon be warnings on stable too. These are multiple
unexpected
cfgcondition value:
force_hash_collisions`` and onefor loop over a
&Option`. This is more readably written as an `if let` statement`Which issue does this PR close?
NA
Rationale for this change
Preserve clean compiles
What changes are included in this PR?
Just the warning fixes
Are these changes tested?
Covered by existing tests
Are there any user-facing changes?
No