Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

odbcPreviewObject: Optimize SELECT against rowLimit #525

Merged
merged 1 commit into from
Jan 6, 2023

Conversation

detule
Copy link
Collaborator

@detule detule commented Dec 28, 2022

This closes #502

When previewing large tables, optimize the SELECT * query since we are only interested in a limited number of rows.

Testing locally against SQL Server with tables as larges as 70K rows shows limited benefits. However, when testing against a Snowflake instance with > 1B rows tables, benefit is very tangible.

Note: Current default for the query of the preview format is SELECT * FROM <table> LIMIT N, with exceptions for SQL Server, and Oracle. Tested the LIMIT N syntax successfully against MYSQL, PSQL, DB2, SNOWFLAKE. If there are users of other more esoteric back-ends that do not support the LIMIT N suffix, we can add S3 methods for them.

Alternative would have been to leave the default preview query un-optimized as SELECT * FROM <table>, and write S3 methods for all of the above that route to the LIMIT N implementation.

I have a small bias for the way it is written in this commit ( assume LIMIT N works for other back-ends and address as needed ), but can go either way.

Thanks

@detule detule requested a review from hadley December 28, 2022 12:05
Copy link
Member

@hadley hadley left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Worth a NEWS bullet too, I think.

@@ -238,7 +238,39 @@ odbcPreviewObject.OdbcConnection <- function(connection, rowLimit, table = NULL,
name <- paste(dbQuoteIdentifier(connection, catalog), name, sep = ".")
}

dbGetQuery(connection, paste("SELECT * FROM", name), n = rowLimit)
dbGetQuery(connection, odbcPreviewQuery(connection, rowLimit, name ),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Argument alignment; again http://github.com/lionel-/codegrip is likely the easiest way to see what we want.


#' Common top-N syntax ( MYSQL, PSQL, DB2, SNOWFLAKE, etc )
#' @rdname odbcPreviewQuery
odbcPreviewQuery.OdbcConnection <- function(connection, rowLimit, name) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably worth a couple of tests?

@@ -78,9 +78,10 @@ test_that("MySQL", {
con <- DBItest:::connect(DBItest:::get_default_context())
dbWriteTable(con, tblName, data.frame(a = 1:10L))
on.exit(dbRemoveTable(con, tblName))
# There should be no "Pending rows" warning
expect_warning({
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can use expect_no_warning() now, for a bit more clarity.

@detule detule force-pushed the feature/preview_query branch from bcc1966 to fa3e3ed Compare January 6, 2023 01:46
@detule detule merged commit 111cedf into r-dbi:main Jan 6, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Limit on query for previewing large tables
2 participants