Bug #63

wangwenqi567 · 2025-03-19T06:42:01Z

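When using rapidtable to recognize an empty table with 5 rows and 5 columns, the text recognition result is empty, so the call into table_engine raises an error. It fails even when an ocr_engine is provided: execution reaches the get_boxes_recs function, but ocr_res is None.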
  File "/usr/local/lib/python3.10/dist-packages/rapid_table/main.py", line 130, in get_boxes_recs
    box = np.array(box)
ValueError: setting an array element with a sequence. The requested array has an inhomogeneous shape after 1 dimensions. The detected shape was (3,) + inhomogeneous part
Code
from pathlib import Path
from rapidocr_paddle import RapidOCR
from rapid_table import RapidTable, RapidTableInput

ROOT_URL = "../models/"
model_path = {
    "encoder": f"{ROOT_URL}/unitable/encoder.pth",
    "decoder": f"{ROOT_URL}/unitable/decoder.pth",
    "vocab": f"{ROOT_URL}/unitable/vocab.json",
}
input_args = RapidTableInput(model_type="unitable", model_path=model_path, use_cuda=True, device="cuda:0")
table_engine = RapidTable(input_args)
ocr_engine = RapidOCR(
    det_use_cuda=True,
    cls_use_cuda=True,
    rec_use_cuda=True,
)

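# The enclosing function was not included in the original snippet;
# the name recognize_table is a placeholder. image_data is the image
# of the empty 5x5 table.
def recognize_table(image_data):
    ocr_res = ocr_engine(image_data)
    print(f"ocr_res: {ocr_res}")
    table_results = table_engine(image_data, ocr_res)
    table_html_str, table_cell_bboxes = table_results.pred_html, table_results.cell_bboxes
    print(table_html_str)
    return table_html_str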
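
A possible caller-side workaround, as a minimal sketch rather than a fix in rapid_table itself: skip the table_engine call when OCR found no text. The helper names `ocr_is_empty` and `safe_table_html` are hypothetical. The check for a pair whose first element is None is an assumption about OCR wrappers that return a (result, elapse) tuple. Returning None for an empty table is also just one choice; an empty HTML table string might suit some callers better.

```python
from typing import Any, Callable, Optional


def ocr_is_empty(ocr_res: Any) -> bool:
    """Return True when the OCR output carries no recognized text.

    Assumption: "no text" shows up as None, an empty sequence, or a
    sequence whose first element is None (e.g. a (result, elapse) pair).
    """
    if ocr_res is None:
        return True
    if len(ocr_res) == 0:
        return True
    return ocr_res[0] is None


def safe_table_html(
    image_data: Any,
    ocr_res: Any,
    table_engine: Callable[[Any, Any], Any],
) -> Optional[str]:
    """Call table_engine only if OCR produced text; otherwise return None."""
    if ocr_is_empty(ocr_res):
        # Nothing to match against table cells, so avoid get_boxes_recs.
        return None
    return table_engine(image_data, ocr_res).pred_html
```

With the reproduction code above, the call would look like `safe_table_html(image_data, ocr_res, table_engine)`. This only avoids the crash; it does not return the structure of the empty table.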
