feat: Improve error message with lazy data frame by explicitly materializing before falling back to dplyr #456

krlmlr · 2025-01-11T13:51:22Z

Closes #432.

github-actions · 2025-01-11T15:28:23Z

This is how benchmark results would change (along with a 95% confidence interval in relative change) if aeb96f3 is merged into main:

✔️001_tpch_01: 23.3ms -> 22.9ms [-5.93%, +2.72%]
✔️001_tpch_02: 67ms -> 66.5ms [-2.01%, +0.39%]
✔️001_tpch_03: 38.4ms -> 38.4ms [-0.66%, +0.99%]
✔️001_tpch_04: 22.1ms -> 21.9ms [-3.05%, +0.92%]
✔️001_tpch_05: 56ms -> 56.3ms [-0.36%, +1.56%]
✔️001_tpch_06: 13.6ms -> 13.9ms [-0.87%, +4.23%]
✔️001_tpch_07: 73.7ms -> 73.5ms [-1.09%, +0.47%]
✔️001_tpch_08: 95.4ms -> 95.5ms [-0.57%, +0.62%]
✔️001_tpch_09: 71.3ms -> 71.6ms [-1.19%, +1.97%]
✔️001_tpch_10: 47.9ms -> 47.6ms [-1.87%, +0.56%]
✔️001_tpch_11: 32.1ms -> 31.9ms [-2.32%, +0.96%]
✔️001_tpch_12: 27ms -> 27ms [-1.79%, +2%]
✔️001_tpch_13: 24.6ms -> 24.8ms [-0.87%, +2.92%]
✔️001_tpch_14: 19.8ms -> 19.8ms [-1.45%, +1.44%]
✔️001_tpch_15: 31.1ms -> 31.3ms [-0.13%, +1.72%]
✔️001_tpch_16: 39.1ms -> 39.4ms [-0.42%, +1.95%]
✔️001_tpch_17: 25.6ms -> 25.8ms [-1.34%, +2.63%]
✔️001_tpch_18: 22.2ms -> 22.1ms [-3.05%, +2.48%]
✔️001_tpch_19: 65.9ms -> 65.7ms [-1.03%, +0.35%]
✔️001_tpch_20: 49.8ms -> 49.6ms [-1.51%, +0.37%]
✔️001_tpch_21: 78.4ms -> 78.5ms [-0.85%, +0.96%]
❗🐌001_tpch_22: 65.2ms -> 66ms [+0.14%, +2.19%]
❗🐌010_tpch_01: 79ms -> 82.4ms [+0.64%, +8%]
✔️010_tpch_02: 71.3ms -> 69.6ms [-6%, +1.2%]
✔️010_tpch_03: 58.1ms -> 58.7ms [-1.17%, +3.3%]
✔️010_tpch_04: 43.6ms -> 42.2ms [-8.5%, +2.06%]
✔️010_tpch_05: 88.7ms -> 88.2ms [-1.93%, +0.84%]
✔️010_tpch_06: 32.3ms -> 31ms [-11.2%, +2.97%]
✔️010_tpch_07: 105ms -> 106ms [-0.41%, +2.07%]
✔️010_tpch_08: 126ms -> 128ms [-1.32%, +3.84%]
✔️010_tpch_09: 115ms -> 114ms [-1.59%, +0.33%]
✔️010_tpch_10: 73.4ms -> 73.4ms [-1.05%, +1.09%]
✔️010_tpch_11: 37.9ms -> 38ms [-4.57%, +5.46%]
✔️010_tpch_12: 59ms -> 57.9ms [-9.95%, +6.25%]
✔️010_tpch_13: 52.9ms -> 52.3ms [-6.64%, +4.14%]
✔️010_tpch_14: 37.9ms -> 37.2ms [-5.68%, +2.28%]
✔️010_tpch_15: 55ms -> 55.9ms [-4.15%, +7.37%]
✔️010_tpch_16: 43ms -> 44.3ms [-1.19%, +6.99%]
✔️010_tpch_17: 54.8ms -> 53ms [-9.91%, +3.42%]
✔️010_tpch_18: 53.2ms -> 52.2ms [-8.79%, +5.11%]
✔️010_tpch_19: 116ms -> 117ms [-2.55%, +4.44%]
✔️010_tpch_20: 66.1ms -> 64.9ms [-6.17%, +2.37%]
✔️010_tpch_21: 239ms -> 242ms [-1.44%, +3.99%]
🚀010_tpch_22: 76.2ms -> 74.9ms [-3.51%, -0.05%]
✔️100_tpch_01: 325ms -> 330ms [-9.28%, +12.38%]
✔️100_tpch_02: 124ms -> 125ms [-1.65%, +4.15%]
✔️100_tpch_03: 181ms -> 185ms [-3.03%, +7.41%]
✔️100_tpch_04: 155ms -> 161ms [-6.33%, +13.89%]
✔️100_tpch_05: 257ms -> 268ms [-8.11%, +15.94%]
✔️100_tpch_06: 112ms -> 106ms [-18.1%, +7.08%]
✔️100_tpch_07: 227ms -> 238ms [-2.54%, +12.72%]
✔️100_tpch_08: 266ms -> 255ms [-11.86%, +4.13%]
🚀100_tpch_09: 350ms -> 317ms [-15.28%, -3.73%]
✔️100_tpch_10: 213ms -> 221ms [-6.47%, +13.23%]
✔️100_tpch_11: 91ms -> 96.4ms [-20.13%, +32.14%]
✔️100_tpch_12: 189ms -> 188ms [-14.12%, +13.84%]
✔️100_tpch_13: 306ms -> 304ms [-4.42%, +2.8%]
🚀100_tpch_14: 126ms -> 116ms [-15.14%, -1.11%]
✔️100_tpch_15: 220ms -> 206ms [-16.28%, +3.39%]
✔️100_tpch_16: 129ms -> 123ms [-10.7%, +1.28%]
✔️100_tpch_17: 186ms -> 175ms [-14.38%, +2.72%]
✔️100_tpch_18: 199ms -> 194ms [-8.86%, +3.52%]
✔️100_tpch_19: 283ms -> 284ms [-4.9%, +5.92%]
✔️100_tpch_20: 181ms -> 173ms [-11.48%, +1.97%]
✔️100_tpch_21: 1.31s -> 1.31s [-6.93%, +6.76%]
✔️100_tpch_22: 173ms -> 175ms [-4.75%, +6.96%]

Further explanation regarding interpretation and methodology can be found in the documentation.
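
As a reading aid for the list above: the markers appear to follow the sign of the 95% confidence interval. Every ✔️ line has an interval that spans zero, ❗🐌 lines lie entirely above zero, and 🚀 lines lie entirely below zero. The sketch below encodes that reading in Python. The regular expression and the `classify` helper are hypothetical, inferred from the lines in this comment rather than taken from the benchmark tooling, whose actual rule may differ.

```python
import re

# Hypothetical pattern for one result line, inferred from the comment above.
# It is not part of the benchmark tooling.
LINE_RE = re.compile(
    r"(?P<name>\d{3}_tpch_\d{2}): (?P<before>\S+) -> (?P<after>\S+) "
    r"\[(?P<lo>[+-]?[\d.]+)%, (?P<hi>[+-]?[\d.]+)%\]"
)


def classify(line: str) -> tuple[str, str]:
    """Return (benchmark name, verdict) for one result line.

    Assumed rule: the change counts as significant only when the whole
    confidence interval lies on one side of zero.
    """
    m = LINE_RE.search(line)  # search() skips any leading marker emoji
    if m is None:
        raise ValueError(f"unrecognized benchmark line: {line!r}")
    lo, hi = float(m["lo"]), float(m["hi"])
    if lo > 0:
        verdict = "slower"
    elif hi < 0:
        verdict = "faster"
    else:
        verdict = "no significant change"
    return m["name"], verdict


if __name__ == "__main__":
    for line in (
        "001_tpch_01: 23.3ms -> 22.9ms [-5.93%, +2.72%]",
        "001_tpch_22: 65.2ms -> 66ms [+0.14%, +2.19%]",
        "010_tpch_22: 76.2ms -> 74.9ms [-3.51%, -0.05%]",
    ):
        print(classify(line))
```

Under this reading, the two flagged slowdowns (001_tpch_22 and 010_tpch_01) are cases where the interval sits just above zero, not necessarily large regressions.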

github-actions · 2025-01-11T19:35:07Z

This is how benchmark results would change (along with a 95% confidence interval in relative change) if c48a9d8 is merged into main:

✔️001_tpch_01: 23.4ms -> 23.9ms [-1.6%, +6.3%]
✔️001_tpch_02: 67.2ms -> 67.3ms [-0.44%, +0.97%]
✔️001_tpch_03: 40.1ms -> 40.3ms [-1.07%, +1.73%]
✔️001_tpch_04: 22.2ms -> 22.3ms [-1.22%, +2.24%]
✔️001_tpch_05: 57ms -> 57.3ms [-0.35%, +1.22%]
✔️001_tpch_06: 14.3ms -> 14.1ms [-4.24%, +1.99%]
✔️001_tpch_07: 74ms -> 74.3ms [-0.2%, +0.99%]
✔️001_tpch_08: 97.6ms -> 97.4ms [-1.07%, +0.77%]
✔️001_tpch_09: 72.2ms -> 72.7ms [-0.41%, +1.6%]
✔️001_tpch_10: 48.9ms -> 48ms [-4.66%, +0.8%]
✔️001_tpch_11: 32.1ms -> 32.2ms [-2.03%, +2.55%]
🚀001_tpch_12: 27.5ms -> 26.8ms [-5.59%, -0.16%]
✔️001_tpch_13: 25.3ms -> 25.3ms [-3.38%, +2.99%]
✔️001_tpch_14: 20.2ms -> 20.1ms [-2.25%, +1.89%]
✔️001_tpch_15: 31.4ms -> 31.3ms [-1.7%, +1.3%]
✔️001_tpch_16: 39.5ms -> 39.8ms [-0.8%, +2.33%]
✔️001_tpch_17: 25.9ms -> 26.2ms [-0.51%, +3.11%]
✔️001_tpch_18: 22.1ms -> 22.1ms [-1.79%, +2.02%]
✔️001_tpch_19: 66.1ms -> 66ms [-0.91%, +0.61%]
✔️001_tpch_20: 50.5ms -> 50.2ms [-1.67%, +0.2%]
✔️001_tpch_21: 78.4ms -> 77.9ms [-1.48%, +0.17%]
✔️001_tpch_22: 66.2ms -> 66.4ms [-0.44%, +1.22%]
✔️010_tpch_01: 84.6ms -> 81.2ms [-13.49%, +5.41%]
✔️010_tpch_02: 73.1ms -> 74ms [-1.08%, +3.6%]
✔️010_tpch_03: 60.8ms -> 60.2ms [-3.92%, +2.07%]
✔️010_tpch_04: 43.3ms -> 43.9ms [-1.2%, +4.4%]
✔️010_tpch_05: 90ms -> 89.4ms [-2.11%, +0.62%]
✔️010_tpch_06: 31.7ms -> 32.2ms [-3.71%, +6.91%]
✔️010_tpch_07: 106ms -> 106ms [-1.18%, +1.43%]
✔️010_tpch_08: 128ms -> 128ms [-3.8%, +3.52%]
✔️010_tpch_09: 116ms -> 116ms [-2.75%, +2.65%]
✔️010_tpch_10: 76.3ms -> 73.7ms [-7.74%, +0.81%]
✔️010_tpch_11: 37.9ms -> 37.2ms [-4.31%, +0.6%]
✔️010_tpch_12: 57ms -> 57.2ms [-2.55%, +3.24%]
✔️010_tpch_13: 52.5ms -> 53.9ms [-6.54%, +11.55%]
✔️010_tpch_14: 37.1ms -> 37ms [-1.25%, +0.87%]
✔️010_tpch_15: 55.4ms -> 54.3ms [-7.48%, +3.77%]
✔️010_tpch_16: 44ms -> 43.6ms [-3.39%, +1.75%]
✔️010_tpch_17: 55.2ms -> 53.6ms [-7.32%, +1.75%]
✔️010_tpch_18: 53.9ms -> 52.9ms [-8.62%, +4.92%]
✔️010_tpch_19: 116ms -> 116ms [-1.97%, +1.38%]
✔️010_tpch_20: 64.9ms -> 65.7ms [-1.23%, +3.62%]
✔️010_tpch_21: 240ms -> 236ms [-5.06%, +1.48%]
✔️010_tpch_22: 75.6ms -> 74.8ms [-2.52%, +0.41%]
✔️100_tpch_01: 333ms -> 334ms [-10.04%, +10.25%]
✔️100_tpch_02: 130ms -> 129ms [-10.7%, +8.24%]
✔️100_tpch_03: 189ms -> 185ms [-10.6%, +6.16%]
✔️100_tpch_04: 162ms -> 151ms [-15.63%, +2.64%]
✔️100_tpch_05: 278ms -> 283ms [-10.74%, +14.41%]
✔️100_tpch_06: 106ms -> 105ms [-26.42%, +24.25%]
✔️100_tpch_07: 249ms -> 251ms [-2.61%, +4.56%]
✔️100_tpch_08: 266ms -> 252ms [-17.78%, +7.16%]
✔️100_tpch_09: 361ms -> 375ms [-7.89%, +15.73%]
✔️100_tpch_10: 217ms -> 224ms [-6.92%, +13.17%]
✔️100_tpch_11: 84.7ms -> 93ms [-18.46%, +38.2%]
✔️100_tpch_12: 195ms -> 195ms [-14.62%, +14.62%]
✔️100_tpch_13: 340ms -> 333ms [-10.06%, +5.95%]
✔️100_tpch_14: 122ms -> 120ms [-14.48%, +11.64%]
✔️100_tpch_15: 221ms -> 214ms [-14.17%, +7.74%]
✔️100_tpch_16: 125ms -> 129ms [-8.52%, +13.95%]
✔️100_tpch_17: 176ms -> 174ms [-14.67%, +13.01%]
✔️100_tpch_18: 198ms -> 209ms [-2.88%, +14.4%]
✔️100_tpch_19: 302ms -> 296ms [-16.57%, +12.94%]
✔️100_tpch_20: 191ms -> 182ms [-17.44%, +7.63%]
✔️100_tpch_21: 1.33s -> 1.3s [-5.2%, +1.08%]
✔️100_tpch_22: 174ms -> 171ms [-4.56%, +1.53%]

Further explanation regarding interpretation and methodology can be found in the documentation.

krlmlr changed the title ~~f 432 mat error~~ Squashed commit of the following: Jan 11, 2025

krlmlr enabled auto-merge (squash) January 11, 2025 13:51

krlmlr marked this pull request as draft January 11, 2025 13:52

auto-merge was automatically disabled January 11, 2025 13:52
Pull request was converted to draft

krlmlr changed the title ~~Squashed commit of the following:~~ feat: Improve error message with lazy data frame by explicitly materializing before falling back to dplyr Jan 11, 2025

krlmlr added 7 commits January 11, 2025 15:12

feat: Improve group_by() error message with lazy data frame

0632faf

Call check_lazy() everywhere

578a0bd

Gen

740f9f3

Pretty

a29e246

Gen

6854e75

Patch

9ec9dd2

Point to help

aeb96f3

krlmlr force-pushed the f-432-mat-error branch from bb0a25d to aeb96f3 Compare January 11, 2025 14:12

krlmlr marked this pull request as ready for review January 11, 2025 14:12

krlmlr enabled auto-merge (squash) January 11, 2025 14:13

krlmlr added 3 commits January 11, 2025 19:16

Add dummy

4925487

Patch

e39d7b9

Unrelated patch

c48a9d8

krlmlr merged commit 6d432e1 into main Jan 11, 2025
20 checks passed

krlmlr deleted the f-432-mat-error branch January 11, 2025 18:34
