We have had several requests to be able to decode streaming data, instead of requiring that the entire file be present. Upon creating a VideoDecoder, users would have some way to indicate that the file they are decoding will be streamed rather than entirely present on the file system.
Motivation, pitch
The main motivation is performance, which can come in two ways:
Overlapping decoding with copying the data from an external system.
Avoiding sending bytes that are never used. For example, if we sample 10 frames from a 100 GB file, it would be great not to have to retrieve the entire 100 GB.
Now that we have approximate mode (see #427), this is more feasible. In approximate mode, we only need to read the very beginning of the file upon startup, which aligns with streaming data. Exact mode would not work with streaming data, because it scans the whole file up front.
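To make the first benefit concrete, here is a minimal sketch of overlapping the copy with the decode, using only the Python standard library. Nothing here is torchcodec API: `fetch_chunks` and `decode_stream` are hypothetical names, the "remote" data is just an in-memory `bytes` object, and `chunk.upper()` is a placeholder for real decoding work.

```python
import queue
import threading

CHUNK_SIZE = 4  # tiny on purpose so the example is easy to follow


def fetch_chunks(remote: bytes, out: queue.Queue) -> None:
    """Producer: stand-in for copying data from an external system."""
    for start in range(0, len(remote), CHUNK_SIZE):
        out.put(remote[start:start + CHUNK_SIZE])
    out.put(None)  # sentinel: no more data


def decode_stream(remote: bytes) -> list:
    """Consumer: processes each chunk as soon as it arrives."""
    # Bounded queue, so the producer cannot run arbitrarily far ahead.
    chunks: queue.Queue = queue.Queue(maxsize=2)
    producer = threading.Thread(target=fetch_chunks, args=(remote, chunks))
    producer.start()
    decoded = []
    while (chunk := chunks.get()) is not None:
        decoded.append(chunk.upper())  # placeholder for real decoding work
    producer.join()
    return decoded


result = decode_stream(b"abcdefghij")
```

The consumer starts working on the first chunk while later chunks are still being fetched, which is the overlap described above. A real decoder would also need to handle chunk boundaries that fall in the middle of a packet.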
It would be great to be able to pass a file object to the VideoDecoder, just like in PyAV. We're using the Lance data format, which has a blob API that supports streaming in this way: https://lancedb.github.io/lance/blob.html
It would be awesome for VideoDecoder to load frames from remote video files without having to download the full file first.
One way could be to add support for file-like objects as input to VideoDecoder, or support fsspec:
url = "https://..."  # or hf://..., s3://...
with fsspec.open(url) as f:
    decoder = VideoDecoder(f)
    decoder[0]  # only loads the first frame from the remote file
This would be useful in training setups like in https://github.com/huggingface/lerobot where we sample video frames when training a model for robotics, especially if we want to stream the data from video files hosted in a Hugging Face dataset.
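As a rough illustration of what a file-like input would need to provide, here is a sketch of a read-only, seekable object that fetches byte ranges on demand and counts how much it transfers. It is not part of torchcodec or fsspec: `LazyRangeReader` and `fetch_range` are hypothetical names, and the remote file is simulated with an in-memory `bytes` object in place of something like an HTTP range request.

```python
import io


class LazyRangeReader(io.RawIOBase):
    """Read-only, seekable file-like object that fetches bytes on demand.

    `fetch_range(start, end)` is a hypothetical callable standing in for a
    remote range request; it returns the bytes in [start, end).
    """

    def __init__(self, fetch_range, size):
        super().__init__()
        self._fetch_range = fetch_range
        self._size = size
        self._pos = 0
        self.bytes_transferred = 0

    def readable(self):
        return True

    def seekable(self):
        return True

    def tell(self):
        return self._pos

    def seek(self, offset, whence=io.SEEK_SET):
        if whence == io.SEEK_SET:
            new_pos = offset
        elif whence == io.SEEK_CUR:
            new_pos = self._pos + offset
        elif whence == io.SEEK_END:
            new_pos = self._size + offset
        else:
            raise ValueError(f"invalid whence: {whence}")
        if new_pos < 0:
            raise ValueError("negative seek position")
        self._pos = new_pos
        return self._pos

    def readinto(self, buffer):
        end = min(self._pos + len(buffer), self._size)
        if end <= self._pos:
            return 0  # at or past end of file
        data = self._fetch_range(self._pos, end)
        buffer[:len(data)] = data
        self._pos += len(data)
        self.bytes_transferred += len(data)
        return len(data)


# Simulated 1 MiB remote file; only the requested ranges are ever copied.
remote = bytes(range(256)) * 4096
reader = LazyRangeReader(lambda start, end: remote[start:end], len(remote))

header = reader.read(16)   # "startup" read at the very beginning of the file
reader.seek(500_000)
sample = reader.read(32)   # one sample from the middle of the file
```

After these two reads, `reader.bytes_transferred` is 48 out of 1,048,576 bytes. A decoder that accepted such an object would only pull the ranges it actually reads, provided it does not scan the whole file at startup.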