-
Notifications
You must be signed in to change notification settings - Fork 802
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: check name when getting PyCapsule pointer #3585
Conversation
d928333
to
be7c625
Compare
/// The `name` should match the path to the module attribute exactly in the form | ||
/// of `"module.attribute"`, which should be the same as the name within the capsule. | ||
/// If this capsule represents a module attribute, the `name` should match the path | ||
/// to the module attribute exactly in the form of `"module.attribute"`, which should | ||
/// be the same as the name within the capsule. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I felt this paragraph was oddly prescriptive, given there are other use cases for PyCapsules
than module attributes. (For example, we use them in Arrow to transport Array in the Arrow PyCapsule Interface.) If you feel otherwise, I can revert this change.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It comes from the documentation for PyCapsule_Import
Based on your adjustment, you wanted to clarify that capsules can be used for things other than module attributes. This is true, though this function only succeeds for module attributes as far as I am aware?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah I see what you are saying. I was looking at this because it's the only API where you can pass a name and have that validated. So instead, I think we want to call GetPointer
, right?
c423ff6
to
c28da26
Compare
CodSpeed Performance ReportMerging #3585 will improve performances by 20.28%Comparing Summary
Benchmarks breakdown
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the PR!
Handing out &mut
references to global data makes me a little uneasy. It feels like it would be extremely easy to accidentally break the safety invariant of no concurrent accesses, especially as the protections of the GIL are weakening in the near future.
Can you provide examples of your use cases for &mut
in a bit more detail? My gut reaction is that it would be desirable to continue to only offer the shared reference API (which is itself unsafe) and instead encourage users to use interior mutability with appropriate synchronisation mechanisms.
/// The `name` should match the path to the module attribute exactly in the form | ||
/// of `"module.attribute"`, which should be the same as the name within the capsule. | ||
/// If this capsule represents a module attribute, the `name` should match the path | ||
/// to the module attribute exactly in the form of `"module.attribute"`, which should | ||
/// be the same as the name within the capsule. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It comes from the documentation for PyCapsule_Import
Based on your adjustment, you wanted to clarify that capsules can be used for things other than module attributes. This is true, though this function only succeeds for module attributes as far as I am aware?
It's in an unsafe
My use case is implementing the Arrow PyCapsule Interface. Consumers of there PyCapsules need to move the contents of the capsule to take ownership, which requires mutable access. This was already implemented using I think I was confusing |
Right, I understand what you're looking for now. I see the following code which I'm taking to be an example of the pattern you use at the moment: validate_pycapsule(array_capsule, "arrow_array")?;
let array_ptr = array_capsule.pointer() as *mut FFI_ArrowArray;
let array = unsafe { std::ptr::replace(array_ptr, FFI_ArrowArray::empty()) }; If we just swapped validate_pycapsule(array_capsule, "arrow_array")?;
let array_mut_ref = unsafe { array_capsule.mut_reference::<FFI_ArrowArray>() };
let array = std::mem::replace(array_mut_ref, FFI_ArrowArray::empty()); So the
If we pull back on the let array = unsafe { std::ptr::replace(array_capsule.pointer_checked("arrow_array").cast(), FFI_ArrowArray::empty()) }; Is that what you were thinking too? I don't really love the name "pointer_checked", so I'm open to better names. Could just be called Also the last time I decided to not check the name originates from here - #2485 (comment) |
I would like to add my concern that this API is just too unsafe for my taste, i.e. validating the global invariants required to produce a mutable reference is too error prone to encourage doing it via a dedicated method. Concerning the name checking. Maybe it would be better to have method that turns PyCapsule::from_object(obj: &PyAny, name: &str) -> PyResult<&PyCapsule>; |
I hear you both on the safety concerns. Having something that returns a (I'm really scratching my head as to why all the bindings libraries--PyO3, pybind11, and nanobind--just pass in the capsules name instead of taking a user-passed name for
Yeah, exactly. I wonder if this API would make sense: /// Get the pointer or return an error
///
/// Returns an error if the names do not match or the pointer is null.
unsafe fn pointer_checked<T>(&self, name: &str) -> PyResult<NonNull<T>> { ... } I like it because:
Then my use case becomes nice and short: let array_ptr = unsafe { capsule.pointer_checked::<FFIArrowArray>("arrow_array")? } ;
let array = unsafe { std::ptr::replace(array_ptr, FFI_ArrowArray::empty()) }; What do you think of that? |
I think the idea with Of course, this does not fully work out due to the Hence, my suggestion to go for So in summary, I think our API tried hard to imbue better invariants into the type but falls short where it must directly interact with the native types. |
I see, so instead of thinking of let capsule = TypedPyCapsule::<FFIArrowArray>::from_object(obj, "arrow_array")?;
let pointer: NonNull<FFIArrowArray> = capsule.get_pointer()?; One thing I like about the existing |
I have definitely wondered in the past about a |
The typed version brings with it a necessarily unsafe type assertion as David points out. Hence, I think a plain |
Right, of course. That would make it: let capsule = unsafe { TypedPyCapsule::<FFIArrowArray>::from_object(obj, "arrow_array")? };
let pointer: NonNull<FFIArrowArray> = capsule.get_pointer()?;
At this point, I feel like it just doesn't bring that much more safety if it's going to return an untyped |
Fixes #3584