Update heatmap.py #78

Merged
merged 1 commit into from
Apr 25, 2024

siddhawan
Contributor

Changed float16 to float32 when dividing the mask by 255. This speeds up detection postprocessing by more than 2x.
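The change described above can be illustrated with a small NumPy sketch. This is not the code from `heatmap.py`; the function name `normalize_mask`, the mask size, and the random data are assumptions made for illustration. It scales a uint8 mask into [0, 1] in both dtypes, checks that the results agree closely, and times each variant. Whether float32 is actually faster depends on the hardware and the NumPy build, so the timings are printed rather than assumed.

```python
import time

import numpy as np


def normalize_mask(mask_u8: np.ndarray, dtype) -> np.ndarray:
    """Scale a uint8 mask into [0, 1] using the given float dtype (hypothetical helper)."""
    return mask_u8.astype(dtype) / 255


def best_time(fn, repeats: int = 5) -> float:
    """Return the fastest wall-clock time in seconds over several runs."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        best = min(best, time.perf_counter() - start)
    return best


# Synthetic stand-in for a segmentation mask; the real mask comes from the model.
rng = np.random.default_rng(0)
mask = rng.integers(0, 256, size=(1024, 1024), dtype=np.uint8)

half = normalize_mask(mask, np.float16)
single = normalize_mask(mask, np.float32)

# Largest absolute difference between the two normalized masks.
max_diff = float(np.abs(half.astype(np.float32) - single).max())

t_half = best_time(lambda: normalize_mask(mask, np.float16))
t_single = best_time(lambda: normalize_mask(mask, np.float32))
print(f"float16: {t_half:.4f} s, float32: {t_single:.4f} s, max diff: {max_diff:.6f}")
```

The two outputs differ only by float16 rounding, so the dtype choice here is about speed rather than accuracy of the thresholded mask.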

@VikParuchuri VikParuchuri changed the base branch from master to dev April 24, 2024 15:59
@VikParuchuri VikParuchuri changed the base branch from dev to master April 24, 2024 15:59
@VikParuchuri
Owner

Thanks for the PR! Where are you seeing the speedup? In the layout benchmark, I saw no real change (running on an M1 Mac).

I ran `python benchmark/layout.py --max 5`

float32

| Layout Type   |   precision |   recall |
|---------------|-------------|----------|
| Image         |        1    |     1    |
| Table         |        1    |     1    |
| Text          |        0.79 |     0.88 |
| Title         |        0.83 |     1    |
Took 12.39 seconds per image, and 62.0 seconds total.

float16

| Layout Type   |   precision |   recall |
|---------------|-------------|----------|
| Image         |        1    |     1    |
| Table         |        1    |     1    |
| Text          |        0.79 |     0.88 |
| Title         |        0.83 |     1    |
Took 12.69 seconds per image, and 63.5 seconds total.

@siddhawan
Contributor Author

siddhawan commented Apr 24, 2024

It's in the text line detection module, which uses segformer, and it shows up when there are a large number of lines. The layout benchmark doesn't produce nearly as many detections as line detection does on a newspaper-like document.

@VikParuchuri
Owner

The layout benchmark does text detection (and the layout model also uses segformer and the same postprocessing script). What pdf were you testing on? Do you mind sharing?

@siddhawan
Contributor Author

siddhawan commented Apr 25, 2024

Results using your command on GPU:

float32

| Layout Type   |   precision |   recall |
|---------------|-------------|----------|
| Image         |        1    |     1    |
| Table         |        1    |     1    |
| Text          |        0.79 |     0.88 |
| Title         |        0.83 |     1    |
Took 1.78 seconds per image, and 8.9 seconds total.
Precision and recall are over the mutual coverage of the detected boxes and the ground truth boxes at a .5 threshold.
Wrote results to results/benchmark/layout_bench

float16

| Layout Type   |   precision |   recall |
|---------------|-------------|----------|
| Image         |        1    |     1    |
| Table         |        1    |     1    |
| Text          |        0.79 |     0.88 |
| Title         |        0.83 |     1    |
Took 2.66 seconds per image, and 13.3 seconds total.
Precision and recall are over the mutual coverage of the detected boxes and the ground truth boxes at a .5 threshold.
Wrote results to results/benchmark/layout_bench

from glob import glob
import time

import torch
from PIL import Image
from surya.ocr import run_ocr
from surya.model.detection import segformer
from surya.detection import batch_text_detection

# Single newspaper page with a large number of text lines
lst = glob('./images/BusinessStandard_20-04-2016_0001.jpg')

IMAGE_PATH = lst[0]
image = Image.open(IMAGE_PATH)

model, processor = segformer.load_model(), segformer.load_processor()

# Time detection plus postprocessing over 10 runs
for i in range(10):
    st = time.time()
    predictions = batch_text_detection([image], model, processor)[0]
    polygons = [p.bbox for p in predictions.bboxes]
    ed = time.time()
    print(ed - st, 'Total time')

I am using this image for my benchmark: BusinessStandard_20-04-2016_0001

float16
Detecting bboxes: 100%|███████████████████████████████████████████████████████████████████████████████| 1/1 [00:00<00:00, 2.53it/s]
5.684298038482666 Total time
Detecting bboxes: 100%|███████████████████████████████████████████████████████████████████████████████| 1/1 [00:00<00:00, 7.66it/s]
5.142843723297119 Total time
Detecting bboxes: 100%|███████████████████████████████████████████████████████████████████████████████| 1/1 [00:00<00:00, 8.10it/s]
5.188343524932861 Total time
Detecting bboxes: 100%|███████████████████████████████████████████████████████████████████████████████| 1/1 [00:00<00:00, 8.20it/s]
5.178464889526367 Total time
Detecting bboxes: 100%|███████████████████████████████████████████████████████████████████████████████| 1/1 [00:00<00:00, 8.23it/s]
5.151301622390747 Total time
Detecting bboxes: 100%|███████████████████████████████████████████████████████████████████████████████| 1/1 [00:00<00:00, 8.22it/s]
5.1321632862091064 Total time
Detecting bboxes: 100%|███████████████████████████████████████████████████████████████████████████████| 1/1 [00:00<00:00, 8.20it/s]
5.151939868927002 Total time
Detecting bboxes: 100%|███████████████████████████████████████████████████████████████████████████████| 1/1 [00:00<00:00, 8.18it/s]
5.162045478820801 Total time
Detecting bboxes: 100%|███████████████████████████████████████████████████████████████████████████████| 1/1 [00:00<00:00, 7.41it/s]
5.144781112670898 Total time
Detecting bboxes: 100%|███████████████████████████████████████████████████████████████████████████████| 1/1 [00:00<00:00, 7.86it/s]
5.158705711364746 Total time

float32
Detecting bboxes: 100%|███████████████████████████████████████████████████████████████████████████████| 1/1 [00:00<00:00, 2.57it/s]
3.202226161956787 Total time
Detecting bboxes: 100%|███████████████████████████████████████████████████████████████████████████████| 1/1 [00:00<00:00, 7.95it/s]
2.2406680583953857 Total time
Detecting bboxes: 100%|███████████████████████████████████████████████████████████████████████████████| 1/1 [00:00<00:00, 7.87it/s]
2.248495578765869 Total time
Detecting bboxes: 100%|███████████████████████████████████████████████████████████████████████████████| 1/1 [00:00<00:00, 7.76it/s]
2.229316234588623 Total time
Detecting bboxes: 100%|███████████████████████████████████████████████████████████████████████████████| 1/1 [00:00<00:00, 8.17it/s]
2.219899892807007 Total time
Detecting bboxes: 100%|███████████████████████████████████████████████████████████████████████████████| 1/1 [00:00<00:00, 8.09it/s]
2.1789700984954834 Total time
Detecting bboxes: 100%|███████████████████████████████████████████████████████████████████████████████| 1/1 [00:00<00:00, 8.17it/s]
2.1955716609954834 Total time
Detecting bboxes: 100%|███████████████████████████████████████████████████████████████████████████████| 1/1 [00:00<00:00, 8.09it/s]
2.230919122695923 Total time
Detecting bboxes: 100%|███████████████████████████████████████████████████████████████████████████████| 1/1 [00:00<00:00, 8.22it/s]
2.2963995933532715 Total time
Detecting bboxes: 100%|███████████████████████████████████████████████████████████████████████████████| 1/1 [00:00<00:00, 8.26it/s]
2.2606070041656494 Total time
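To put a single number on the logs above, here is a small sketch that summarizes the reported timings. The values are copied from the runs above; dropping the first run of each series as warm-up and using the median are choices made here, not something stated in the thread. On these numbers the ratio comes out at roughly 2.3x.

```python
from statistics import median

# Per-run wall-clock times in seconds, copied from the logs above.
float16_runs = [
    5.684298038482666, 5.142843723297119, 5.188343524932861,
    5.178464889526367, 5.151301622390747, 5.1321632862091064,
    5.151939868927002, 5.162045478820801, 5.144781112670898,
    5.158705711364746,
]
float32_runs = [
    3.202226161956787, 2.2406680583953857, 2.248495578765869,
    2.229316234588623, 2.219899892807007, 2.1789700984954834,
    2.1955716609954834, 2.230919122695923, 2.2963995933532715,
    2.2606070041656494,
]


def steady_state_median(runs):
    """Median of all runs except the first, which is treated as warm-up."""
    return median(runs[1:])


f16 = steady_state_median(float16_runs)
f32 = steady_state_median(float32_runs)
speedup = f16 / f32

print(f"float16 median: {f16:.3f} s")
print(f"float32 median: {f32:.3f} s")
print(f"speedup: {speedup:.2f}x")
```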

@VikParuchuri
Owner

Thanks for the detailed benchmarks! Merging now.

@VikParuchuri VikParuchuri merged commit 06cc681 into VikParuchuri:master Apr 25, 2024
@VikParuchuri
Owner

Hi @siddhawan apologies for pinging you again, but it turns out I need people to sign a CLA for this project. You can find it here.

If you agree, please comment with "I have read the CLA Document and I hereby sign the CLA".

If you don't agree, I may have to rewrite your contribution. Thanks again for taking the time to fix this issue.

@siddhawan
Contributor Author

I have read the CLA document and I hereby sign the CLA

Contributor

github-actions bot commented May 2, 2024

CLA Assistant Lite bot All contributors have signed the CLA ✍️ ✅

@siddhawan
Contributor Author

recheck

@x4080

x4080 commented May 31, 2024

Hi, I'm using a Mac M2. How do I use the GPU, or does it just work? Thanks
