-
Notifications
You must be signed in to change notification settings - Fork 16
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
StateForDocumentValidator performance improvement #3189
Merged
Merged
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
68ea7ee
to
880b3c4
Compare
We've been investigating an incident where a particular document (content_id "c7346901-13fe-47df-a1f0-b583b78bf6e7", locale "cy") is taking an enormous amount of time to publish (30 to 120s) and thus preventing the publishing app from succeeding in a publish due to HTTP timeout errors. We've only managed to experience these issues in production and have traced them to this particular query. Despite the change we made 8 years ago [1], setting order(nil) before a first no longer seems to have an affect and we found it was generating a query of: ``` SELECT "editions".* FROM "editions" WHERE "editions"."document_id" = 267 AND "editions"."state" IN ('published', 'unpublished') AND "editions"."id" != 1699 ORDER BY "editions"."id" ASC LIMIT 1 ``` When analyzed in production this was incredibly slow: ``` publishing-api(prod)> Edition.where(document: live_cy_edition.document, state: %w[published unpublished]).where.not(id: live_cy_edition.id).order(:id).limit(1).explain(:analyze) => EXPLAIN (ANALYZE) SELECT "editions".* FROM "editions" WHERE "editions"."document_id" = 1147583 AND "editions"."state" IN ('published', 'unpublished') AND "editions"."id" != 16909044 ORDER BY "editions"."id" ASC LIMIT 1 /*application='PublishingAPI'*/ QUERY PLAN ------------------------------------------------------------------------------------------------------------------------------------------------- Limit (cost=0.56..3107.54 rows=1 width=785) (actual time=15911.808..15911.809 rows=0 loops=1) -> Index Scan using editions_pkey on editions (cost=0.56..4306281.11 rows=1386 width=785) (actual time=15911.808..15911.808 rows=0 loops=1) Filter: (((state)::text = ANY ('{published,unpublished}'::text[])) AND (id <> 16909044) AND (document_id = 1147583)) Rows Removed by Filter: 16414087 Planning Time: 0.183 ms Execution Time: 15911.834 ms (6 rows) ```` However we found it was not slow in staging as it was using a different query plan: ``` publishing-api(prod)> Edition.where(document: live_cy_edition.document, state: %w[published unpublished]).where.not(id: live_cy_edition.id).order(:id).limit(1).explain(:analyze) => EXPLAIN (ANALYZE) SELECT "editions".* FROM "editions" WHERE "editions"."document_id" = 1147583 AND "editions"."state" IN ('published', 'unpublished') AND "editions"."id" != 16907234 ORDER BY "editions"."id" ASC LIMIT 1 /*application='PublishingAPI'*/ QUERY PLAN ---------------------------------------------------------------------------------------------------------------------------------------------------------------------- Limit (cost=4946.22..4946.22 rows=1 width=791) (actual time=0.025..0.026 rows=0 loops=1) -> Sort (cost=4946.22..4949.53 rows=1323 width=791) (actual time=0.025..0.025 rows=0 loops=1) Sort Key: id Sort Method: quicksort Memory: 25kB -> Index Scan using index_editions_on_document_id_and_state on editions (cost=0.43..4939.60 rows=1323 width=791) (actual time=0.021..0.021 rows=0 loops=1) Index Cond: ((document_id = 1147583) AND ((state)::text = ANY ('{published,unpublished}'::text[]))) Filter: (id <> 16907234) Rows Removed by Filter: 1 Planning Time: 0.173 ms Execution Time: 0.048 ms (10 rows) ``` By switching this query to use a `pick` method we remove the ordering clause, which causes this query to be fast again in production. By doing this we also save having to hydrate the ActiveRecord model unnecessarily as we only use the id value from it. New query analyze: ``` publishing-api(prod)> Edition.where(document: live_cy_edition.document, state: %w[published unpublished]).where.not(id: live_cy_edition.id).limit(1).explain(:analyze) => EXPLAIN (ANALYZE) SELECT "editions".* FROM "editions" WHERE "editions"."document_id" = 1147583 AND "editions"."state" IN ('published', 'unpublished') AND "editions"."id" != 16909044 LIMIT 1 /*application='PublishingAPI'*/ QUERY PLAN ---------------------------------------------------------------------------------------------------------------------------------------------------------------- Limit (cost=0.43..4.17 rows=1 width=785) (actual time=0.021..0.021 rows=0 loops=1) -> Index Scan using index_editions_on_document_id_and_state on editions (cost=0.43..5174.88 rows=1386 width=785) (actual time=0.020..0.021 rows=0 loops=1) Index Cond: ((document_id = 1147583) AND ((state)::text = ANY ('{published,unpublished}'::text[]))) Filter: (id <> 16909044) Rows Removed by Filter: 1 Planning Time: 0.135 ms Execution Time: 0.038 ms (7 rows) ``` [1]: d513940 Co-authored-by: Kevin Dew <kevindew@me.com>
We found that in `StateForDocumentValidator`, `order(nil).first` was causing a query to take a long time to complete (~15s). We only needed the ID for the record and found that using `pick(:id)` brought the query time down below a second, so we're doing the same in this validator, which has an equivalent requirement
91e459b
to
8582132
Compare
kevindew
approved these changes
Mar 7, 2025
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Update performance of validator to help with a current publishing incident
This application is owned by the publishing platform team. Please let us know in #govuk-publishing-platform when you raise any PRs.
Follow these steps if you are doing a Rails upgrade.