Pull tables out of PDFs, websites, and other documents
The table recognition endpoint is `/api/v1/table_rec`.
Here is an example request in Python:
The request body is sent as `multipart/form-data`.
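A minimal sketch of such a request, assuming the third-party `requests` library. Only the `/api/v1/table_rec` path and the `use_llm`, `force_ocr`, and `output_format` parameters come from the text above; the host in `BASE_URL`, the `X-Api-Key` header name, the `file` form field name, and the `"true"`/`"false"` encoding of booleans are placeholders to replace with the values for your account.

```python
import os

BASE_URL = "https://api.example.com"  # assumption: placeholder host, replace with the real one
TABLE_REC_PATH = "/api/v1/table_rec"  # endpoint path from the documentation
ALLOWED_FORMATS = ("html", "json", "markdown")


def build_form_fields(output_format="markdown", use_llm=False, force_ocr=False):
    """Return the non-file form fields as strings, since multipart values are text."""
    if output_format not in ALLOWED_FORMATS:
        raise ValueError(f"unsupported output_format: {output_format!r}")
    return {
        "output_format": output_format,
        "use_llm": str(use_llm).lower(),      # assumption: booleans sent as "true"/"false"
        "force_ocr": str(force_ocr).lower(),  # assumption: same encoding as use_llm
    }


def submit_table_rec(path, api_key, **options):
    """Upload one document as multipart/form-data and return the parsed JSON reply."""
    import requests  # third-party; imported here so build_form_fields works without it

    with open(path, "rb") as fh:
        response = requests.post(
            BASE_URL + TABLE_REC_PATH,
            files={"file": (os.path.basename(path), fh)},  # assumption: field is named "file"
            data=build_form_fields(**options),
            headers={"X-Api-Key": api_key},  # assumption: header name for the API key
            timeout=60,
        )
    response.raise_for_status()
    return response.json()
```

Passing both `files=` and `data=` is what makes `requests` encode the body as `multipart/form-data`. A call might look like `submit_table_rec("report.pdf", api_key, output_format="json", use_llm=True)`.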
A note on parameters:
- `use_llm` is worth testing on the types of documents you care about, depending on your accuracy/latency tradeoffs. Enabling it can often fix issues with dense tables.
- `force_ocr` will slow down inference, but can fix rendering issues, e.g. with ligatures in text.
- `output_format` selects whether your tables come back as `html`, `json`, or `markdown`. If you use `json`, the output includes `Table` and `TableCell` blocks where available, including their bounding boxes.

Then poll `request_check_url`, like this, to check your table recognition result:
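One way to poll is sketched below. It assumes that `request_check_url` is a URL taken from the reply to the initial request, that it returns JSON with a `status` field, and that the same hypothetical `X-Api-Key` header is used for authentication. The helper names, the poll count, and the delay are illustrative choices, not values from the API.

```python
import time


def poll_until_complete(check_url, fetch_json, max_polls=300, delay_seconds=2.0,
                        sleep=time.sleep):
    """Call fetch_json(check_url) until the reply's status is "complete"."""
    for _ in range(max_polls):
        data = fetch_json(check_url)
        if data.get("status") == "complete":
            return data
        sleep(delay_seconds)  # wait before asking again
    raise TimeoutError(f"no completed result after {max_polls} polls")


def make_fetcher(api_key):
    """Build a fetch_json callable backed by the third-party requests library."""
    import requests

    def fetch_json(url):
        response = requests.get(
            url,
            headers={"X-Api-Key": api_key},  # assumption: header name for the API key
            timeout=30,
        )
        response.raise_for_status()
        return response.json()

    return fetch_json
```

Typical use would be `result = poll_until_complete(check_url, make_fetcher(api_key))`. Keeping the HTTP call behind `fetch_json` lets you swap in a stub when testing the loop.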
Eventually, the `status` field will be set to `complete`, and you will get a response containing your identified tables. They’ll be within a key corresponding to your value for `output_format`, i.e. `html`, `json`, or `markdown`.
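Reading the tables out of a finished response could look like the sketch below. It relies only on what is described above (a `status` field and a key named after the chosen `output_format`); the function name is hypothetical, and the shape of the value under that key is not assumed.

```python
def extract_tables(result, output_format):
    """Return the table payload stored under the key matching output_format."""
    if result.get("status") != "complete":
        raise ValueError("result is not complete yet; keep polling")
    if output_format not in result:
        raise KeyError(f"response has no {output_format!r} key")
    return result[output_format]
```

For example, after requesting `output_format="markdown"`, `extract_tables(result, "markdown")` returns whatever the service placed under the `markdown` key.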