Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Explain for swordfish #3667

Merged
merged 5 commits into from
Jan 27, 2025
Merged

feat: Explain for swordfish #3667

merged 5 commits into from
Jan 27, 2025

Conversation

colin-ho
Copy link
Contributor

@colin-ho colin-ho commented Jan 10, 2025

Enable printing swordfish plan in .explain(show_all=True), and adds more detail in explain analyze, e.g. show the estimated stats and then the actual stats

Examples (TPCH Q2)

@github-actions github-actions bot added the feat label Jan 10, 2025
Copy link

codspeed-hq bot commented Jan 10, 2025

CodSpeed Performance Report

Merging #3667 will improve performances by 18%

Comparing colin/swordfish-explain (9eaa486) with main (bae106c)

Summary

⚡ 2 improvements
✅ 25 untouched benchmarks

Benchmarks breakdown

Benchmark main colin/swordfish-explain Change
test_count[1 Small File] 3.8 ms 3.3 ms +18%
test_iter_rows_first_row[100 Small Files] 208.1 ms 187.7 ms +10.9%

Copy link

codecov bot commented Jan 10, 2025

Codecov Report

Attention: Patch coverage is 25.92025% with 483 lines in your changes missing coverage. Please review.

Project coverage is 77.57%. Comparing base (bae106c) to head (9eaa486).
Report is 6 commits behind head on main.

Files with missing lines Patch % Lines
src/daft-local-execution/src/sources/scan_task.rs 2.59% 75 Missing ⚠️
src/daft-local-execution/src/run.rs 0.00% 65 Missing ⚠️
...ecution/src/intermediate_ops/actor_pool_project.rs 0.00% 23 Missing ⚠️
...l-execution/src/sinks/anti_semi_hash_join_probe.rs 8.00% 23 Missing ⚠️
src/daft-local-execution/src/sources/source.rs 36.11% 23 Missing ⚠️
...aft-local-execution/src/sinks/grouped_aggregate.rs 9.52% 19 Missing ⚠️
...local-execution/src/sinks/outer_hash_join_probe.rs 9.52% 19 Missing ⚠️
src/daft-local-execution/src/sinks/sort.rs 9.52% 19 Missing ⚠️
src/daft-local-execution/src/sinks/pivot.rs 10.00% 18 Missing ⚠️
...ft-local-execution/src/intermediate_ops/project.rs 11.11% 16 Missing ⚠️
... and 25 more
Additional details and impacted files

Impacted file tree graph

@@            Coverage Diff             @@
##             main    #3667      +/-   ##
==========================================
- Coverage   77.91%   77.57%   -0.35%     
==========================================
  Files         727      727              
  Lines       91129    91661     +532     
==========================================
+ Hits        71007    71106      +99     
- Misses      20122    20555     +433     
Files with missing lines Coverage Δ
src/common/display/src/ascii.rs 0.00% <0.00%> (ø)
src/daft-local-execution/src/runtime_stats.rs 67.91% <0.00%> (ø)
daft/dataframe/dataframe.py 85.27% <25.00%> (-0.16%) ⬇️
...aft-local-execution/src/intermediate_ops/filter.rs 82.35% <25.00%> (-17.65%) ⬇️
src/daft-local-execution/src/sinks/concat.rs 83.33% <0.00%> (-7.58%) ⬇️
...ft-local-execution/src/sinks/cross_join_collect.rs 93.75% <25.00%> (-6.25%) ⬇️
src/daft-local-execution/src/sinks/limit.rs 93.33% <0.00%> (-6.67%) ⬇️
...execution/src/sinks/monotonically_increasing_id.rs 95.71% <0.00%> (-4.29%) ⬇️
src/daft-logical-plan/src/stats.rs 68.42% <0.00%> (-1.23%) ⬇️
daft/execution/native_executor.py 63.63% <28.57%> (-16.37%) ⬇️
... and 25 more

... and 3 files with indirect coverage changes

@@ -39,7 +39,7 @@ fn fmt_tree_indent_style<'a, W: fmt::Write + 'a>(

// Print the tree recursively, and illustrate the tree structure in the same style as `git log --graph`.
// `depth` is the number of forks in this node's ancestors.
fn fmt_tree_gitstyle<'a, W: fmt::Write + 'a>(
pub fn fmt_tree_gitstyle<'a, W: fmt::Write + 'a>(
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, the physical plan and logical plan don't directly use this function, instead they call plan.fmt_tree(), which is the function below.

But for swordfish's Box<dyn PipelineNode>, I'm not sure how to call fmt_tree because it requires Self : Sized. So i made fmt_tree_gitstyle public and just call it directly.

pub trait AsciiTreeDisplay: TreeDisplay {
    // // Print the whole tree represented by this node.
    fn fmt_tree<'a, W: fmt::Write + 'a>(&self, w: &'a mut W, simple: bool) -> fmt::Result
    where
        Self: Sized,
    {
        let level = if simple {
            crate::DisplayLevel::Compact
        } else {
            crate::DisplayLevel::Default
        };

        fmt_tree_gitstyle(self, 0, w, level)
    }

@colin-ho colin-ho requested a review from samster25 January 11, 2025 00:26
@colin-ho colin-ho marked this pull request as ready for review January 11, 2025 00:26
@samster25
Copy link
Member

I think @universalmind303 would be a better reviewer for this

@samster25 samster25 requested review from universalmind303 and removed request for samster25 January 21, 2025 18:36
@colin-ho colin-ho merged commit 603199f into main Jan 27, 2025
42 of 43 checks passed
@colin-ho colin-ho deleted the colin/swordfish-explain branch January 27, 2025 18:16
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants