-
Notifications
You must be signed in to change notification settings - Fork 186
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix: to_cnf and to_dnf functions #3728
Conversation
CodSpeed Performance ReportMerging #3728 will degrade performances by 36.53%Comparing Summary
Benchmarks breakdown
|
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## main #3728 +/- ##
==========================================
+ Coverage 77.55% 77.58% +0.03%
==========================================
Files 729 729
Lines 91917 92005 +88
==========================================
+ Hits 71284 71383 +99
+ Misses 20633 20622 -11
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
approving, but i'd prefer to merge the mod/tests into single file.
This PR adds the following: 1. `simplify` now recursively simplifies the entire expression tree 2. Simplification rules `simplify_and` and `simplify_or` to simplify boolean expressions 3. A simplify expressions step to the logical optimizer before scan task materialization I also did some refactoring work that did not change behavior: 1. Converted various functions to take in `Arc<T>` instead of `T`, which is more ergonomic to use and avoids a potential clone when calling it with an Arc variable using `Arc::unwrap_or_clone` 2. Split up simplify code into multiple files Combined with #3728, this PR brings major performance improvements to TPCH Q19 by pushing down and simplifying filter predicates.
The pre-order expression tree traversal algorithm I used for
to_dnf
turned out not to actually fully convert a statement to disjunctive normal form. Since it was also used into_cnf
, that did not fully work either.This new algorithm recursively converts subtrees to DNF form and propagates that upwards. In each boolean AND, the subtrees are assumed to be fully DNF, and it uses that assumption to merge the two sides into one DNF statement.
Also a quick fix for de Morgan's law on the case of triple negatives.