Installation

Install the Datalab SDK:
pip install datalab-python-sdk
Get your API key from datalab.to. Set it as an environment variable or pass it directly to the client.
export DATALAB_API_KEY=your_api_key_here
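The second option, passing the key directly to the client, could look roughly like the sketch below. The `api_key` keyword is an assumption, since the text above does not name the constructor parameter, and `load_api_key` and `make_client` are hypothetical helpers that only illustrate failing early when the variable is missing.

```python
import os


def load_api_key(env_var: str = "DATALAB_API_KEY") -> str:
    """Return the API key from the environment, or raise a clear error."""
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"{env_var} is not set; export it or pass the key explicitly")
    return key


def make_client():
    """Build a client with an explicitly supplied key."""
    # Imported lazily so load_api_key() can be used without the SDK installed.
    from datalab_sdk import DatalabClient

    # Assumption: the constructor accepts the key as `api_key`.
    return DatalabClient(api_key=load_api_key())
```

Checking for the key before constructing the client turns a missing variable into an immediate, readable error rather than a failed request later on.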

Convert a Document

The SDK provides a simple interface to convert documents to Markdown, HTML, JSON, or chunks.
from datalab_sdk import DatalabClient

client = DatalabClient()  # Uses DATALAB_API_KEY env var

# Convert PDF to markdown
result = client.convert("document.pdf")
print(result.markdown)

# Save output and images
result.save_output("output/")

Conversion Options

Control the conversion with options:
from datalab_sdk import DatalabClient, ConvertOptions

client = DatalabClient()

options = ConvertOptions(
    output_format="markdown",  # "markdown", "html", "json", "chunks"
    mode="balanced",           # "fast", "balanced", "accurate"
    paginate=True,             # Add page delimiters
    page_range="0-10",         # Process specific pages (0-indexed)
)

result = client.convert("document.pdf", options=options)
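Because `page_range` is 0-indexed, it is easy to be off by one when starting from the page numbers shown in a PDF viewer. The hypothetical helper below converts 1-indexed page numbers into a range string. It assumes the range is inclusive at both ends and that a single page may be written as `"2-2"`; the example above does not confirm either.

```python
def to_page_range(first_page: int, last_page: int) -> str:
    """Convert 1-indexed, inclusive page numbers to a 0-indexed range string.

    Assumption: the service treats both ends of "start-end" as inclusive.
    """
    if first_page < 1:
        raise ValueError("first_page must be 1 or greater")
    if last_page < first_page:
        raise ValueError("last_page must not be before first_page")
    # Shift both ends down by one to move from 1-indexed to 0-indexed.
    return f"{first_page - 1}-{last_page - 1}"
```

With this helper, the first eleven pages of a document, `to_page_range(1, 11)`, yield the same `"0-10"` string used in the options above.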

Processing Modes

Mode      Description
fast      Lowest latency, good for simple documents
balanced  Balance of speed and accuracy (default)
accurate  Highest accuracy, best for complex layouts

Fill PDF Forms

Fill forms in PDFs or images with structured data:
from datalab_sdk import DatalabClient, FormFillingOptions

client = DatalabClient()

options = FormFillingOptions(
    field_data={
        "full_name": {"value": "John Doe", "description": "Full legal name"},
        "date": {"value": "2024-01-15", "description": "Today's date"},
        "signature": {"value": "John Doe", "description": "Signature field"},
    }
)

result = client.fill("form.pdf", options=options)
result.save_output("filled_form.pdf")

Upload and Manage Files

Upload files to Datalab for use in workflows:
from datalab_sdk import DatalabClient

client = DatalabClient()

# Upload files
uploaded = client.upload_files(["doc1.pdf", "doc2.pdf"])
for file in uploaded:
    print(f"{file.original_filename}: {file.reference}")
    # Output: doc1.pdf: datalab://file-abc123

# List your files
files = client.list_files(limit=50)
print(f"Total files: {files['total']}")
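For later steps it can help to look up a file's `datalab://` reference by its original name. The sketch below relies only on the `original_filename` and `reference` attributes used in the loop above. `references_by_filename` is a hypothetical helper, and the `SimpleNamespace` objects are stand-ins for real upload results so that the example runs without network access.

```python
from types import SimpleNamespace


def references_by_filename(uploaded) -> dict:
    """Map each uploaded file's original name to its reference string."""
    return {f.original_filename: f.reference for f in uploaded}


# Stand-ins for the objects returned by client.upload_files(...).
fake_uploads = [
    SimpleNamespace(original_filename="doc1.pdf", reference="datalab://file-abc123"),
    SimpleNamespace(original_filename="doc2.pdf", reference="datalab://file-def456"),
]
refs = references_by_filename(fake_uploads)
```

If two uploads share a filename, the later one overwrites the earlier entry, so this mapping suits batches with unique names.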

CLI

The SDK includes a command-line interface:
# Convert a single document
datalab convert document.pdf --output_format markdown

# Convert with options
datalab convert document.pdf --mode accurate --paginate

# Convert a directory
datalab convert ./documents/ --output_dir ./output/

Async Support

For high-throughput applications, use the async client:
import asyncio
from datalab_sdk import AsyncDatalabClient

async def convert_documents():
    async with AsyncDatalabClient() as client:
        result = await client.convert("document.pdf")
        print(result.markdown)

asyncio.run(convert_documents())
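The throughput benefit comes from running several conversions concurrently rather than awaiting them one by one. A minimal sketch of that pattern follows, using only the `convert` coroutine shown above together with standard `asyncio` primitives. `convert_many` is a hypothetical helper, and the default of 4 concurrent requests is an arbitrary illustration, not a documented limit.

```python
import asyncio


async def convert_many(client, paths, max_concurrency: int = 4):
    """Convert several files concurrently, at most max_concurrency at a time."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def convert_one(path):
        # The semaphore caps how many conversions are in flight at once.
        async with semaphore:
            return await client.convert(path)

    # gather preserves input order, so results[i] corresponds to paths[i].
    return await asyncio.gather(*(convert_one(p) for p in paths))


async def main():
    # Imported lazily so convert_many() can be exercised without the SDK.
    from datalab_sdk import AsyncDatalabClient

    async with AsyncDatalabClient() as client:
        results = await convert_many(client, ["a.pdf", "b.pdf", "c.pdf"])
        for result in results:
            print(result.markdown[:80])
```

Calling `asyncio.run(main())` runs the batch. Note that `gather` raises the first exception it encounters by default; passing `return_exceptions=True` would collect failures alongside successful results instead.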
