Approximate seeking mode #427
Comments
This would be great. Quite eager to try it in Litdata: https://github.com/Lightning-AI/litdata
Sharing some thoughts on how we should go about implementing this, based on some code diving. First, I think we want to implement this mode in the C++ layer, not just in Python. It's too fundamental a concept to try to make it work outside of C++. In the …
Work-in-progress PR: #440. It takes the approach mentioned above, and also exposes the mode at the Python layer. It still has some bugs, but I think it proves out the general approach.
Update: PR #440 passes all tests and is showing the expected performance. See the current benchmark numbers on the PR.
Implemented in #440. |
🚀 The feature
TorchCodec's public VideoDecoder should have an approximate seek mode. Users should be able to specify that they want this mode when they instantiate the decoder.
Motivation, pitch
The primary motivation is performance. Currently, TorchCodec always performs exact seeks. We accomplish exact seeks by first scanning the entire video file and building up our own frame table internally. This means we're not susceptible to bad header metadata, but it adds an upfront linear cost to all decoding. That cost hurts performance both for large files and for workloads that decode sequentially from the start.
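To make the trade-off concrete, here is a minimal, self-contained sketch of the two seek strategies. It is illustrative only and does not reflect TorchCodec's actual internals: exact mode pays an upfront scan to build a table of presentation timestamps (pts) and then binary-searches it, while approximate mode skips the scan and estimates the frame index from the header's average frame rate (which can be wrong for variable-frame-rate files).

```python
import bisect

class TinySeeker:
    """Toy model of exact vs. approximate timestamp-to-frame-index seeking."""

    def __init__(self, pts_seconds, avg_fps, seek_mode="exact"):
        self.seek_mode = seek_mode
        self.avg_fps = avg_fps
        # The upfront O(n) scan of the file only happens in exact mode.
        self.frame_table = sorted(pts_seconds) if seek_mode == "exact" else None

    def frame_index_at(self, ts):
        if self.seek_mode == "exact":
            # Last frame whose pts <= ts; this is what a scanned table enables.
            return max(bisect.bisect_right(self.frame_table, ts) - 1, 0)
        # Approximate: trust header metadata; may be off for variable frame rate.
        return int(ts * self.avg_fps)

# A variable-frame-rate video: frames are NOT evenly spaced in time.
pts = [0.0, 0.5, 0.6, 0.7, 1.5, 2.0]
exact = TinySeeker(pts, avg_fps=3.0, seek_mode="exact")
approx = TinySeeker(pts, avg_fps=3.0, seek_mode="approximate")
print(exact.frame_index_at(0.65))   # 2 (the frame at pts 0.6)
print(approx.frame_index_at(0.65))  # 1 (0.65 * 3.0 rounds down to 1)
```

The divergence on the last two lines shows why approximate mode is a correctness trade-off, not just a fast path: it is only as accurate as the header metadata it trusts.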
This is a high-priority feature, as it should help address some of the performance issues users are currently seeing.