Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Add ScalarUDF support in FFI crate #14579

Merged
merged 15 commits into from
Feb 19, 2025

Conversation

timsaucer
Copy link
Contributor

@timsaucer timsaucer commented Feb 10, 2025

Which issue does this PR close?

Addresses part of #14562 - specifically scalar udfs

Rationale for this change

This is a pure addition were we expose ScalarUDF across the FFI boundary. This enables new functionality about creating reusable libraries, such as in python.

What changes are included in this PR?

Addition of scalar udf FFI wrappers

Are these changes tested?

  • Create additional unit tests inside FFI crate
  • Expand the existing integration test to include scalar udf
  • Create a proof of concept in datafusion-python and test

Are there any user-facing changes?

This is a pure addition.

@timsaucer timsaucer changed the title initial commit for scalar udf in ffi crate Add ScalarUDF support in FFI crate Feb 13, 2025
@timsaucer timsaucer self-assigned this Feb 13, 2025
@timsaucer timsaucer marked this pull request as ready for review February 13, 2025 12:35
@timsaucer timsaucer changed the title Add ScalarUDF support in FFI crate feat: Add ScalarUDF support in FFI crate Feb 13, 2025
Copy link
Contributor

@alamb alamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you @timsaucer - this looks good to me. I do think it would be valuable to seriously consider simplifying the Signature portion of the FFI to keep it smaller and more likely to be be stable

Also as I understand, having an FFI API for functions would make it easier for function libraries to be used without having to exactly match the DataFuson version which seems like a great idea to me.

#[repr(C)]
#[derive(StableAbi)]
#[allow(non_camel_case_types)]
pub struct FFI_Signature {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if we should keep this API simpler as I think the actual signature API is about to change in the near future. For example, the work from @jayzhan211 and @shehabgamin in the following PR

One thought I had was to to only support the equivalent of TypeSignature::UserDefined

While this would make it somewhat less conveniente to implement user defined functions I think it would result in much smaller FFI API and also be much less likely to change overtime

Maybe once the TypeSignature / Coecion code has settled down we could add more to the FFI interface

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this now match what you were looking for?

Copy link
Contributor

@alamb alamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me! Thanks @timsaucer


pub volatility: FFI_Volatility,

pub return_type: unsafe extern "C" fn(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Eventually it would be great to document these a bit more -- for example here we can mention that we only expose the return type method to simplify the signature

@timsaucer timsaucer merged commit 8ab0661 into apache:main Feb 19, 2025
24 checks passed
@timsaucer timsaucer deleted the feat/scalar-udf-ffi branch February 19, 2025 18:17
@alamb
Copy link
Contributor

alamb commented Feb 20, 2025

woohoo!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants