-
Notifications
You must be signed in to change notification settings - Fork 881
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Discussion] Object Store Composition #7171
Comments
Just some questions since I'm still fairly green when it comes to really understanding the intricacies of Rust's type system...
I'm not sure what you mean when you say that wrappers must avoid de-specializing. Does that have something to do with compiler optimizations?
So this would also involve using something like the |
They might provide their own implementations of things like delete_stream, the wrappers must therefore call through to this instead of relying on the default implementation
Perhaps, but given things like AsyncFileReader are per-file, it may be that they can just be constructed with the relevant context |
I basically agree with this statement of challenge, though I am not sure how hard it actually is in practice (having done it myself and seen various different versions of it)
I think the parquet reader also needs to be able to get the total file sizes as well (at least unless the negative ranges to fetch end bytes is supported). Adding support to
I would say "perverse" is somewhat subjective. It is certainly complex but that also needs to be measured against
This feels like a good idea to me
I think OpenDAL https://github.com/apache/opendal is trying to provide generic all-purpose IO subsystem abstraction So my personal recommendation is
|
It occurs to me this might be a great time to provide easier integration for using parquet-rs reader with OpenDAL directly (as in have a |
I will let @Xuanwo weigh in here, but I think OpenDAL is in a very similar place to ObjectStore w.r.t this, and has very similar issues. Both are abstractions for data "access", not a general IO subsystem abstraction. Edit: Ultimately something has to glue together downstream abstractions, e.g. in the context of #7135 providing a way to connect DF's SessionContext through to some IO subsystem. Either DF needs to overload some existing interface e.g. ObjectStore/OpenDAL inevitably leading to challenges like #7155 or it needs to define its own mechanism. In the case of parquet and AysncFileReaderFactory, this interface already exists we just need to point people at it.
This isn't a requirement - see here
I remain pretty lukewarm on this, given parquet-opendal already does this. |
I agree we are going to need some sort of API to pass context through the various API layers 👍 |
Thank you @tustvold for inviting me to join this discussion. I believe we should build If this becomes a reality, DataFusion can design the abstraction based on its own requirements without having to push everything upstream to We can start by aliasing the |
ThoughtsStatus Quo at Influx
(from #7171 (comment) ) I've just looked at our setup and this is how the hierarchy currently looks like (from outer to inner):
Not saying that this is what people usually do, but I thought that insight might be helpful. Trait methods
(from #7171 (comment) ) I think that by itself was a design mistake (in hindsight). There should be ONE single GET method (probably
(from #7171 (comment) ) At Influx we use Sans-IO / IO Abstraction
(from #7171 (comment) ) Depends. I would say "more low-level", but now you're in a different pickle because you now longer see the high-level methods or at least you partially have to recover them (e.g. is it an upload/put, a copy, a move). So I don't think the instrumentation on that level is universally better. I do however agree that splitting the IO layer is the right move. |
FWIW this is an alternative proposal I had written up as part of the issue (before I accidentally closed the tab and lost it). The challenge with this approach is there are methods that have different signatures, e.g. whilst delete_stream has a default implementation in terms of delete, the whole purpose is to allow specialization. The same can be said for get_ranges or list_with_delimiter. Ultimately when I tried this, the extension trait would allow us to remove the basic non-opts variants, but many of the methods are not just simple proxies to a _opts call and need to be specializable.
My vague thought here is we might be able to do something with http::Extensions to propagate this context lower. |
That sounds like a compromise I could live with. That said, I don't think any of the proposals really gonna help eliminating wrappers. Looking at the hierarchy posted here, sure we could move the metrics into the IO layer. Chunking is already harder because now you again need to "reverse" the high-level intention from the low-level HTTP call. Racing might work. Caching not really. And putting everything into the DataFusion-specific wrappers (which might not be your only high-level consumer) just moves the problem or makes it even worse: now the entire code base uses the data fusion type instead of the |
Could you expand on what you mean by chunking? |
If you have a large object (like 100MB), instead of using a single GET request for the whole range, using multiple requests in -- let's say -- 16MB chunks. Our internal testing showed that while the throughput and hence latency of a single S3 GET request is somewhat limited, you can reduce the overall latency and increase the throughput using that method. |
I wonder if that makes more sense being done explicitly on top of the ObjectStore API, instead of as a layer within a wrapper? There have been some discussions in the past about providing something similar to AWS TransferManager to handle this sort of use-case. (e.g. #6837) |
Well, the chunking sits between the IO and the caching layer. So sure you can put a lot of things on top of |
I think this is an excellent idea -- I suggest the next step would be to open a ticket in DataFusion to discuss creating such an API @Xuanwo is filing a ticket something you are able to do? Otherwise I will try and find time to do so |
I'm willing to fill an issue at DF. |
With #7183 we will have a mechanism to introduce request-oriented middleware, along with more sophisticated logic for doing things like:
This will reduce the need for ObjectStore wrappers, as well as address a number of their more glaring deficiencies. However, it will not address the need to get context (#7155) through the ObjectStore trait either for use within an ObjectStore wrapper, or within the HTTPClient. As such I think we should proceed with #7170, as I don't really see a viable alternative to doing this. I also am excited by the possibilities of combining #7183 and #7170 to allow doing things like carrying information into S3 access logs as described here. |
Problem
Initially the ObjectStore API was relatively simple, consisting of a few methods to interact with object stores. As such many systems took this abstraction and used it as a generic IO abstraction, this is good and what the crate was designed for.
As people wanted additional functionality, such as instrumentation, caching or concurrency limiting, this was implemented by creating ObjectStore implementations that wrap the existing ones. Again this worked well.
However, over time the ObjectStore API has grown, and now has 8 required methods and a further 10 methods with default implementations. This creates a number of challenges for this wrapper based approach for composition.
API Surface
As a wrapper must avoid "despecializing" methods, it must implement all 18 methods. Not only is this burdensome, but creates upgrade hazards as new methods are added, potentially in non-breaking versions.
Additional Context
As the logic within these wrappers has grown more complex, there comes the need to pass additional information through to this logic. This motivates requests like #7155
Interface Creep
In many places the ObjectStore interface gets used as the abstraction for components that don't actually require the full breadth of ObjectStore functionality. There is no need, for example, for a parquet reader to depend on more than the ability to fetch ranges of bytes.
This leads to perverse "ObjectStore" implementations, that actually only implement say get functionality. Similarly in contexts like apache/datafusion#14286 it creates complexities around how to shim the full ObjectStore interface, despite the actual operators in question only using a very small subset of this functionality.
Request Correlation
As the ObjectStore logic has gotten more sophisticated, incorporating automatic retries, request batching, etc... the relationship between an ObjectStore method call and requests has gotten rather fuzzy. This makes implementing instrumentation, concurrency limiting, tokio task dispatch, etc... at this API boundary increasingly inaccurate/problematic.
Thoughts
I personally think we should encourage a move away from this wrapper based form of composition and instead do the following:
I can't help feeling right now ObjectStore is stuck between trying to expose the functionality of ObjectStore's in a portable and ergonomic fashion, whilst also trying to provide some sort of generic all-purpose IO subsystem abstraction, which I'm not sure aren't incompatible goals....
Tagging @alamb @crepererum @Xuanwo @waynr @kylebarron
The text was updated successfully, but these errors were encountered: