Releases · VikParuchuri/surya

Inline math detection

Add new inline math detection model and benchmark

Textract OCR benchmark

Benchmark surya against Textract as well as Google Cloud Vision. For English only, the results are:

Model	Time per page (s)	Avg Score	English
surya	0.522628	0.983298	0.983298
textract	1.44293	0.947458	0.947458
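
To make the table easier to read at a glance, the short sketch below derives the relative speed and the score gap from the two rows. The numbers are copied from the table; the `results` dictionary and the `compare` helper are illustrative names only and are not part of surya or its benchmark scripts.

```python
# Figures copied from the benchmark table above (English only).
results = {
    "surya": {"time_per_page_s": 0.522628, "avg_score": 0.983298},
    "textract": {"time_per_page_s": 1.44293, "avg_score": 0.947458},
}


def compare(results, a, b):
    """Return (speedup of a over b, average score of a minus that of b)."""
    speedup = results[b]["time_per_page_s"] / results[a]["time_per_page_s"]
    score_gap = results[a]["avg_score"] - results[b]["avg_score"]
    return speedup, score_gap


speedup, score_gap = compare(results, "surya", "textract")
print(f"surya is {speedup:.2f}x faster per page than textract")  # 2.76x
print(f"surya scores {score_gap:.4f} higher on average")         # 0.0358
```

On these figures surya takes roughly a third of the time per page and scores about 3.6 points higher on a 0-100 scale.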

XLA support

Add support for TPUs through XLA. Inference on TPUs is still fairly slow, but there is room for many optimizations.

Minor speedups

Refactor inference to get a 5-10% speed boost across all models.

What's Changed

Add XLA support by @iammosespaulr in #298
Add Inline Math Detection by @tarun-menta in #297
Update to new line detection model by @tarun-menta in #305
Fix merging of inline boxes by drawing textlines in heatmap by @tarun-menta in #309
XLA improvements by @VikParuchuri in #306
Update inline math checkpoint by @VikParuchuri in #310
Misc Line Detection Fixes by @tarun-menta in #313
Add Textract OCR Benchmark by @tarun-menta in #307
Inline math model, new text detection model by @VikParuchuri in #312

Full Changelog: v0.10.3...v0.11.0


New text detection model

Improved inline math model

Model downloads with S3

Download models with S3

Misc fixes

Fix inline detection bug

v0.11.0

Fix height issue

Fix pytorch 2.6 bug

Pin pytorch

Add LaTeX OCR model

New OCR model and streamlit app

Fix cli script issue