Releases: VikParuchuri/surya

New text detection model

28 Feb 21:53
7e5ac9d

The new text detection model detects more text across a range of PDFs, which should improve OCR performance.

Fix bug with model downloads.

What's Changed

Full Changelog: v0.12.1...v0.13.0

Improved inline math model

24 Feb 16:34
acb87df

What's Changed

Full Changelog: v0.12.0...v0.12.1

Model downloads with S3

19 Feb 16:32
7a79fb5

Download models with S3

  • Improve speed and reliability by downloading models with S3

Misc fixes

  • Use opencv headless to avoid GUI dependencies

What's Changed

Full Changelog: v0.11.1...v0.12.0

Fix inline detection bug

13 Feb 01:31
5b61bd7

  • Fix streamlit bug
  • Fix inline detection bug

v0.11.0

10 Feb 17:28
d349f30

Inline math detection

  • Add new inline math detection model and benchmark

Textract OCR benchmark

Benchmark surya against Textract as well as Google Cloud Vision. For English only, the results look like:

Model     Time per page (s)  Avg Score  English
surya     0.522628           0.983298   0.983298
textract  1.44293            0.947458   0.947458
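
To make the table easier to read at a glance, the two rows can be turned into a relative speedup and a score difference. This is only a small arithmetic sketch over the figures quoted above; the `results` dictionary and the `compare` helper are illustrative names, not part of surya.

```python
# Figures copied from the English-only benchmark table above.
results = {
    "surya": {"time_per_page_s": 0.522628, "avg_score": 0.983298},
    "textract": {"time_per_page_s": 1.44293, "avg_score": 0.947458},
}


def compare(baseline: str, candidate: str) -> tuple[float, float]:
    """Return (speedup, score_gain) of `candidate` relative to `baseline`.

    speedup > 1 means the candidate spends less time per page.
    """
    b, c = results[baseline], results[candidate]
    speedup = b["time_per_page_s"] / c["time_per_page_s"]
    score_gain = c["avg_score"] - b["avg_score"]
    return speedup, score_gain


speedup, score_gain = compare("textract", "surya")
print(f"surya: {speedup:.2f}x faster per page, score +{score_gain:.4f}")
```

On these numbers surya takes a bit more than a third of the time Textract does per page, with an average score roughly 0.036 higher.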

XLA support

Add support for TPUs. Inference is still fairly slow, but there are many optimizations left to make.

Minor speedups

Refactor inference to get a 5-10% speed boost across all models.

What's Changed

Full Changelog: v0.10.3...v0.11.0

Fix height issue

06 Feb 21:03
06a3cc6

Fix an issue where text detection wouldn't resize images properly, leading to bounding boxes in the wrong place in tall images.
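
The release note does not show the fix itself, so the following is only a generic sketch of the class of problem: when a detector runs on a resized copy of a page, its boxes have to be mapped back to the original image with the scale factor of each axis. The function name `rescale_bbox` and the sizes used are hypothetical and are not taken from surya's code.

```python
def rescale_bbox(bbox, processed_size, original_size):
    """Map an (x1, y1, x2, y2) box from the resized image to the original.

    Sizes are (width, height). Each axis uses its own scale factor, so a
    tall page whose height was shrunk more than its width still maps back
    to the right place.
    """
    proc_w, proc_h = processed_size
    orig_w, orig_h = original_size
    sx = orig_w / proc_w  # horizontal scale factor
    sy = orig_h / proc_h  # vertical scale factor
    x1, y1, x2, y2 = bbox
    return (x1 * sx, y1 * sy, x2 * sx, y2 * sy)


# Hypothetical numbers: a tall 1200x4800 page resized to 600x1200.
box = rescale_bbox((60, 300, 300, 330), (600, 1200), (1200, 4800))
print(box)  # (120.0, 1200.0, 600.0, 1320.0)
```

If a single factor were applied to both axes here, the vertical coordinates would be off by a factor of two, which is the kind of misplacement that shows up mostly on tall images.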

Fix pytorch 2.6 bug

31 Jan 02:36
780d351

Fix a bug that caused issues on MPS (Mac) devices when using PyTorch 2.6.

Pin pytorch

30 Jan 18:07
551584b

PyTorch 2.6.0 doesn't work well with some of the models on MPS (Mac), so PyTorch is pinned to the previous version.

Add LaTeX OCR model

29 Jan 15:04
31d9126

New OCR model and streamlit app

  • Release a new LaTeX OCR model
  • Add streamlit app to interactively select and OCR equations

What's Changed

New Contributors

Full Changelog: v0.9.3...v0.10.0

Fix cli script issue

24 Jan 15:39
aa8ee5a

Fix an issue with CLI scripts and folders.