This guide helps you migrate from deprecated Datalab API endpoints to their current replacements.

Table Recognition → Document Conversion

The standalone Table Recognition endpoint (/api/v1/table_rec) is deprecated. Use the Document Conversion endpoint with JSON output instead.

Before (deprecated)

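# Old: Dedicated table recognition endpoint
import requests

API_KEY = "YOUR_API_KEY"  # placeholder: substitute your own key

with open("doc.pdf", "rb") as f:  # open in binary mode for upload
    response = requests.post(
        "https://www.datalab.to/api/v1/table_rec",
        files={"file": ("doc.pdf", f, "application/pdf")},
        headers={"X-API-Key": API_KEY},
    )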

After (current)

from datalab_sdk import DatalabClient, ConvertOptions

client = DatalabClient()

options = ConvertOptions(
    output_format="json",
    mode="balanced"
)

result = client.convert("document.pdf", options=options)

# Tables are in the JSON output with block_type "Table"
for block in result.json.get("children", []):
    if block.get("block_type") == "Table":
        print(f"Table: {block['id']}")
        print(f"Bounding box: {block['bbox']}")
        # Access cells in block['children']
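
If the JSON tree nests blocks (for example, a table sitting under a page-level block rather than directly at the top level), a single loop over the top-level children can miss it. The following is a minimal sketch of a depth-first search, assuming only what the snippet above shows: each block is a dict with a `block_type` key and an optional `children` list. The helper names `iter_blocks` and `find_tables` are hypothetical and not part of `datalab_sdk`, and the sample tree is hand-made for illustration rather than real API output.

```python
from typing import Any, Dict, Iterator, List


def iter_blocks(block: Dict[str, Any]) -> Iterator[Dict[str, Any]]:
    """Yield the block itself and every descendant, depth-first."""
    yield block
    # "children" may be missing or None on leaf blocks (assumption)
    for child in block.get("children") or []:
        yield from iter_blocks(child)


def find_tables(doc: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Collect every block whose block_type is "Table", at any depth."""
    return [b for b in iter_blocks(doc) if b.get("block_type") == "Table"]


# Hand-made sample tree for illustration only
sample = {
    "block_type": "Document",
    "children": [
        {
            "block_type": "Page",
            "id": "/page/0/Page/0",
            "children": [
                {"block_type": "Text", "id": "/page/0/Text/0", "children": None},
                {
                    "block_type": "Table",
                    "id": "/page/0/Table/1",
                    "bbox": [0, 0, 10, 10],
                    "children": [],
                },
            ],
        }
    ],
}

for table in find_tables(sample):
    print(f"Table: {table['id']}")
    print(f"Bounding box: {table['bbox']}")
```

With a real result, you would pass `result.json` to `find_tables` in place of `sample`.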

Why migrate?

  • Single endpoint: no need for separate table-specific calls
  • Better accuracy: tables are extracted in context with the full document
  • More features: access processing modes, structured extraction, and more
  • Active development: the Document Conversion endpoint receives all new improvements

OCR → Document Conversion

The standalone OCR endpoint (/api/v1/ocr) is deprecated. Use the Document Conversion endpoint instead, which includes OCR as part of its processing pipeline.

Before (deprecated)

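# Old: Dedicated OCR endpoint
import requests

API_KEY = "YOUR_API_KEY"  # placeholder: substitute your own key

with open("doc.pdf", "rb") as f:  # open in binary mode for upload
    response = requests.post(
        "https://www.datalab.to/api/v1/ocr",
        files={"file": ("doc.pdf", f, "application/pdf")},
        headers={"X-API-Key": API_KEY},
    )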

After (current)

from datalab_sdk import DatalabClient, ConvertOptions

client = DatalabClient()

# For text extraction, use markdown output
result = client.convert("document.pdf")
print(result.markdown)

# For page-level text, use JSON output
options = ConvertOptions(output_format="json")
result = client.convert("document.pdf", options=options)
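
To turn the JSON output into page-level text, one option is to walk the block tree and collect text per page. This is a rough sketch under stated assumptions, not the SDK's own method: it assumes the top-level `children` are page blocks with `block_type` equal to `"Page"`, and that leaf blocks carry their content in an `html` string. Neither detail is confirmed by this guide, so check them against your actual output. The helpers `html_to_text`, `block_text`, and `page_texts` are hypothetical names.

```python
from html.parser import HTMLParser
from typing import Any, Dict, List


class _TextExtractor(HTMLParser):
    """Collect the text nodes of an HTML fragment, dropping the tags."""

    def __init__(self) -> None:
        super().__init__()
        self.parts: List[str] = []

    def handle_data(self, data: str) -> None:
        self.parts.append(data)


def html_to_text(html: str) -> str:
    """Strip tags and collapse runs of whitespace into single spaces."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()  # flush any buffered text
    return " ".join("".join(parser.parts).split())


def block_text(block: Dict[str, Any]) -> str:
    """Text of a block: joined child texts, or its own html if it is a leaf."""
    children = block.get("children") or []
    if children:
        texts = (block_text(child) for child in children)
        return "\n".join(t for t in texts if t)
    return html_to_text(block.get("html") or "")  # "html" key is an assumption


def page_texts(doc: Dict[str, Any]) -> List[str]:
    """One string per top-level block of type "Page" (assumed layout)."""
    pages = doc.get("children") or []
    return [block_text(p) for p in pages if p.get("block_type") == "Page"]


# Hand-made sample for illustration only, not real API output
sample = {
    "block_type": "Document",
    "children": [
        {
            "block_type": "Page",
            "children": [
                {"block_type": "SectionHeader", "html": "<h1>Invoice</h1>"},
                {"block_type": "Text", "html": "<p>Total: <b>42</b> USD</p>"},
            ],
        },
        {"block_type": "Page", "children": []},
    ],
}

for number, text in enumerate(page_texts(sample), start=1):
    print(f"--- page {number} ---")
    print(text)
```

With a real result, you would pass `result.json` to `page_texts` in place of `sample`.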

Why migrate?

  • Better results: the conversion pipeline includes OCR plus layout analysis, table recognition, and image handling
  • More output formats: get markdown, HTML, JSON, or chunks instead of raw text
  • Quality scoring: parse_quality_score tells you how well the document was processed

Next Steps