Extend lambda support for ClickHouse and DuckDB dialects #1686
@@ -13285,3 +13285,98 @@ fn test_trailing_commas_in_from() {
"SELECT 1, 2 FROM (SELECT * FROM t1), (SELECT * FROM t2)",
);
}

#[test]
fn test_lambdas() {
Moved from databricks tests
@@ -83,69 +83,6 @@ fn test_databricks_exists() {
);
}

#[test]
Moved into common tests
src/dialect/mod.rs
/// ```sql
/// SELECT transform(array(1, 2, 3), x -> x + 1); -- returns [2,3,4]
/// ```
fn supports_parensless_lambda_functions(&self) -> bool {
If I understand correctly, the idea behind introducing this method is to avoid the Generic dialect's syntax conflict caused by its pg json syntax support? If so, I'm thinking it could make more sense to turn off `supports_lambda_functions` for the Generic dialect instead. The idea with that dialect is that it gets feature support by default only if there is no conflicting syntax. So if it's not expected that a dialect supports `(x) -> y` but not `x -> y`, then maybe the Generic dialect shouldn't support lambdas after all.
You got it right. I think your point makes sense and I'm fine with turning off `supports_lambda_functions` for the Generic dialect, but should `supports_parensless_lambda_functions` be removed from the `Dialect` trait too? I want to run Datafusion with support for both lambdas (even if with limited syntax) and pg json syntax from datafusion-functions-json, and `supports_parensless_lambda_functions` allows me to use a custom dialect to do so. In the worst case, I can run without pg syntax and support only direct function calls (`json_get`, etc.).
cc @samuelcolvin in case you have interest in this too
I don't have much context here, personally I think we'll want to switch off lambdas.
Hmm, given the expressions `x->y` or `(x)->y`: if a dialect supports both the pg json `->` operator and the lambda syntax, that suggests an ambiguous grammar (there doesn't seem to be a way to tell which of the two each expression should be parsed into)?
My thinking was indeed to potentially turn off lambdas for the Generic dialect and remove `supports_parensless_lambda_functions`. The latter sounds like we'd be introducing a behavior to the parser that is only relevant in certain combinations, covered neither by any of the parser's supported dialects nor by a SQL spec for reference, which instinctively feels like a slippery slope.
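To make the ambiguity concrete, here is a minimal, self-contained sketch. It does not use the real sqlparser API: `Expr`, `ToyDialect` and `parse_arrow` are hypothetical names invented for illustration, and the "parser" only handles bare identifiers. It shows that the same text can only be given one of the two meanings, so a flag has to pick between them.

```rust
/// Hypothetical, simplified AST for illustration only (not sqlparser's real `Expr`).
#[derive(Debug, PartialEq)]
enum Expr {
    Ident(String),
    /// Postgres-style JSON access: `left -> right`.
    JsonArrow(Box<Expr>, Box<Expr>),
    /// Lambda: `param -> body`.
    Lambda { param: String, body: Box<Expr> },
}

/// Hypothetical dialect flag; the real trait has many more methods.
struct ToyDialect {
    supports_lambda_functions: bool,
}

/// Parses `lhs -> rhs` or `(lhs) -> rhs`, where both sides are bare identifiers.
/// The input text alone never says which meaning is intended; only the flag does.
fn parse_arrow(input: &str, dialect: &ToyDialect) -> Option<Expr> {
    let (lhs, rhs) = input.split_once("->")?;
    // Strip optional parentheses around the left-hand side.
    let lhs = lhs.trim().trim_start_matches('(').trim_end_matches(')').trim();
    let rhs = rhs.trim();
    if lhs.is_empty() || rhs.is_empty() {
        return None;
    }
    let rhs = Box::new(Expr::Ident(rhs.to_string()));
    Some(if dialect.supports_lambda_functions {
        Expr::Lambda { param: lhs.to_string(), body: rhs }
    } else {
        Expr::JsonArrow(Box::new(Expr::Ident(lhs.to_string())), rhs)
    })
}

fn main() {
    let lambda_dialect = ToyDialect { supports_lambda_functions: true };
    let json_dialect = ToyDialect { supports_lambda_functions: false };
    for sql in ["x -> y", "(x) -> y"] {
        println!("{sql:10} as lambda: {:?}", parse_arrow(sql, &lambda_dialect));
        println!("{sql:10} as json:   {:?}", parse_arrow(sql, &json_dialect));
    }
}
```

Both `x -> y` and `(x) -> y` produce a lambda under one flag and a JSON access under the other, so a dialect that enabled both features would have no basis for choosing.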
Agreed. The lambda syntax looks identical to Pg JSON lookup syntax with a completely different meaning.
Agreed. `supports_parensless_lambda_functions` is removed and lambda support is restricted to ClickHouse, Databricks and DuckDB. I actually forgot the nested expr `(expr)`, so yes, this conflicted with pg; it just wasn't caught by any test. Plain wrong. Thanks 🙏
Looks like the CI errors are related to #1693
LGTM! Thanks @gstvg and @samuelcolvin!
cc @alamb
Extend the lambda support that landed in #1257 to more dialects (but not Snowflake, see #1273)
- Returns true for `supports_lambda_functions` for ClickHouse, DuckDB and Generic dialects
- Adds `supports_parensless_lambda_functions` to `Dialect`
- Returns true for `supports_parensless_lambda_functions` for ClickHouse, Databricks and DuckDB dialects, but not Generic, to avoid a conflict with Postgres JSON syntax

This is a breaking change because, to parse a parensless lambda `x -> x + 1`, a dialect must now also implement `supports_parensless_lambda_functions`
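The outcome agreed in the review thread (a single `supports_lambda_functions` flag, enabled only for ClickHouse, Databricks and DuckDB, with Generic opting out) can be sketched with a default trait method. This is a cut-down, hypothetical stand-in rather than the crate's actual `Dialect` trait: the `name` method, the unit structs and `lambda_dialects` are assumptions made for illustration.

```rust
/// Hypothetical, cut-down stand-in for a dialect trait; illustration only.
trait Dialect {
    fn name(&self) -> &'static str;

    /// Off by default, so each dialect has to opt in explicitly.
    fn supports_lambda_functions(&self) -> bool {
        false
    }
}

struct ClickHouse;
struct Databricks;
struct DuckDb;
struct Generic;

impl Dialect for ClickHouse {
    fn name(&self) -> &'static str { "ClickHouse" }
    fn supports_lambda_functions(&self) -> bool { true }
}

impl Dialect for Databricks {
    fn name(&self) -> &'static str { "Databricks" }
    fn supports_lambda_functions(&self) -> bool { true }
}

impl Dialect for DuckDb {
    fn name(&self) -> &'static str { "DuckDB" }
    fn supports_lambda_functions(&self) -> bool { true }
}

impl Dialect for Generic {
    fn name(&self) -> &'static str { "Generic" }
    // Keeps the default (false): `->` stays available for pg json syntax.
}

/// Returns the names of the dialects that opted in to lambdas.
fn lambda_dialects(dialects: &[&dyn Dialect]) -> Vec<&'static str> {
    dialects
        .iter()
        .filter(|d| d.supports_lambda_functions())
        .map(|d| d.name())
        .collect()
}

fn main() {
    let all: [&dyn Dialect; 4] = [&ClickHouse, &Databricks, &DuckDb, &Generic];
    // Prints ["ClickHouse", "Databricks", "DuckDB"]
    println!("{:?}", lambda_dialects(&all));
}
```

Leaving the default as `false` means a dialect with conflicting `->` syntax never picks up lambda parsing by accident.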