Slow Dataset Query Updates #2534
Codecov Report
Coverage Diff

| Metric | main | #2534 | +/- |
| --- | --- | --- | --- |
| Coverage | 83.86% | 83.86% | |
| Complexity | 1245 | 1245 | |
| Files | 238 | 238 | |
| Lines | 5657 | 5657 | |
| Branches | 271 | 271 | |
| Hits | 4744 | 4744 | |
| Misses | 769 | 769 | |
| Partials | 144 | 144 | |
@phixMe, great work on identifying ways to optimize our dataset query! Would you mind adding a query plan, or an analysis of the query before and after? I agree with the changes, but I think an analysis would help us better understand them.
We discussed offline, these changes look great! Thanks for the perf improvements 💯
* Updating the sql for dataset get by name and namespace, and list endpoint
* Update for test failure.
* Adding qualifiers back in to list query

---------

Co-authored-by: phix <peter.hicks@astronomer.io>
Co-authored-by: Willy Lulciuc <willy@datakin.com>
Problem
These queries were slow for Marquez instances with many datasets, dataset versions, and facets.
Solution
One-line summary: Scopes down nested facet queries to the same scope as the outer query.
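To illustrate the idea of scoping a nested query to the outer query's filter, here is a minimal sketch using Python's `sqlite3` module. It is not the SQL from this PR: the tables `datasets` and `dataset_facets`, their columns, and the helper `fetch` are hypothetical stand-ins, and SQLite is used here only so the example is self-contained (Marquez itself runs on PostgreSQL). The unscoped query aggregates facets for every dataset before joining; the scoped one repeats the outer filter inside the subquery so that only the matching dataset's facets are aggregated.

```python
import sqlite3

# Hypothetical, simplified schema; not the actual Marquez tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE datasets (uuid INTEGER PRIMARY KEY, namespace TEXT, name TEXT);
CREATE TABLE dataset_facets (dataset_uuid INTEGER, facet TEXT);
INSERT INTO datasets VALUES (1, 'ns', 'a'), (2, 'ns', 'b'), (3, 'other', 'c');
INSERT INTO dataset_facets VALUES
    (1, 'schema'), (1, 'ownership'), (2, 'schema'), (3, 'schema');
""")

# Unscoped: the nested query groups the facets of ALL datasets,
# and the outer WHERE discards most of that work afterwards.
UNSCOPED = """
SELECT d.name, f.facet_count
FROM datasets d
LEFT JOIN (
    SELECT df.dataset_uuid, COUNT(*) AS facet_count
    FROM dataset_facets df
    GROUP BY df.dataset_uuid
) f ON f.dataset_uuid = d.uuid
WHERE d.namespace = :ns AND d.name = :name
"""

# Scoped: the nested query applies the same filter as the outer query,
# so only facets of the requested dataset are grouped.
SCOPED = """
SELECT d.name, f.facet_count
FROM datasets d
LEFT JOIN (
    SELECT df.dataset_uuid, COUNT(*) AS facet_count
    FROM dataset_facets df
    WHERE df.dataset_uuid IN (
        SELECT uuid FROM datasets WHERE namespace = :ns AND name = :name
    )
    GROUP BY df.dataset_uuid
) f ON f.dataset_uuid = d.uuid
WHERE d.namespace = :ns AND d.name = :name
"""


def fetch(query, ns, name):
    """Run a query with named parameters and return all rows as a list."""
    return conn.execute(query, {"ns": ns, "name": name}).fetchall()


unscoped_rows = fetch(UNSCOPED, "ns", "a")
scoped_rows = fetch(SCOPED, "ns", "a")
```

Both queries return the same rows; the difference is how much of the facet table the nested query has to read. On a real instance, comparing the query plans before and after would be the way to confirm the effect.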
Checklist
* CHANGELOG.md (Depending on the change, this may not be necessary).
* .sql database schema migration according to Flyway's naming convention (if relevant)