The form filling API fills PDF and image forms with your structured data. It works with native PDF form fields and scanned/image forms.

SDK Usage

from datalab_sdk import DatalabClient, FormFillingOptions

client = DatalabClient()

options = FormFillingOptions(
    field_data={
        "name": {"value": "John Doe", "description": "Full name"},
        "email": {"value": "john.doe@example.com", "description": "Email address"},
        "date": {"value": "12/15/2024", "description": "Today's date"},
    }
)

result = client.fill("form.pdf", options=options)
result.save_output("filled_form.pdf")

print(f"Fields filled: {result.fields_filled}")
print(f"Fields not found: {result.fields_not_found}")
See SDK Form Filling for complete SDK documentation.

How It Works

  1. Upload your form (PDF or image) with field data
  2. The API detects form fields and matches them to your data
  3. Fields are filled and the form is returned as PDF or PNG

Field Data Format

Provide field names with values and descriptions:
field_data = {
    "field_key": {
        "value": "The value to fill",
        "description": "Description to help match the field"
    }
}
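When the source data is a flat dictionary, the nested shape above can be produced with a small helper. This is a minimal sketch: `build_field_data` is a hypothetical name, not part of the SDK, and converting every value to a string is an assumption based on the examples on this page, which all use string values.

```python
def build_field_data(values, descriptions=None):
    """Wrap plain values in the {"value": ..., "description": ...} shape."""
    descriptions = descriptions or {}
    field_data = {}
    for key, value in values.items():
        field_data[key] = {
            # Assumption: values are sent as strings, as in the documented examples.
            "value": str(value),
            # Fall back to a readable form of the key when no description is given.
            "description": descriptions.get(key, key.replace("_", " ").capitalize()),
        }
    return field_data


field_data = build_field_data(
    {"first_name": "John", "zip_code": 10001},
    descriptions={"zip_code": "ZIP code"},
)
```

Explicit descriptions are likely to match better than the generated fallback, since the description is what helps the API match a field.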

Examples

Basic fields:
field_data = {
    "first_name": {"value": "John", "description": "First name"},
    "last_name": {"value": "Doe", "description": "Last name"},
    "ssn": {"value": "123-45-6789", "description": "Social Security Number"}
}
Checkboxes:
field_data = {
    "is_citizen": {"value": "yes", "description": "US citizenship status"},
    "agree_terms": {"value": "checked", "description": "Terms agreement"}
}
Values like "yes", "true", "1", "checked", "x" will check boxes.
Compound data:
field_data = {
    "full_address": {
        "value": "123 Main St, New York, NY, 10001",
        "description": "Complete address"
    }
}
The API can split compound data across multiple form fields.
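Checkbox and compound entries can also be generated from typed data. The sketch below uses two hypothetical helpers, `checkbox_fields` and `compound_field`. Because this page only documents the values that check a box, the sketch omits unchecked boxes entirely and does not guess at a value that unchecks them.

```python
def checkbox_fields(flags, descriptions):
    """Return field_data entries only for flags that are True."""
    return {
        key: {"value": "yes", "description": descriptions[key]}
        for key, checked in flags.items()
        if checked  # unchecked boxes are left out of the request
    }


def compound_field(parts, description):
    """Join several parts into one comma-separated compound value."""
    return {"value": ", ".join(parts), "description": description}


field_data = {
    **checkbox_fields(
        {"is_citizen": True, "agree_terms": False},
        {"is_citizen": "US citizenship status", "agree_terms": "Terms agreement"},
    ),
    "full_address": compound_field(
        ["123 Main St", "New York", "NY", "10001"], "Complete address"
    ),
}
```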

Options

| Option | Type | Default | Description |
| --- | --- | --- | --- |
| field_data | dict | Required | Field names mapped to values and descriptions |
| context | str | None | Additional context to help match fields |
| confidence_threshold | float | 0.5 | Minimum confidence for field matching (0.0-1.0) |
| max_pages | int | None | Maximum pages to process |
| page_range | str | None | Specific pages to process |
| skip_cache | bool | False | Skip cached results |
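Option values can be checked locally before a request is sent. The following is a sketch only: `fill_option_kwargs` is a hypothetical helper, and passing its result as keyword arguments to `FormFillingOptions` assumes the constructor accepts every option under the name listed above (this page only shows `field_data` and `context` being passed that way).

```python
def fill_option_kwargs(field_data, context=None, confidence_threshold=0.5,
                       max_pages=None, page_range=None, skip_cache=False):
    """Validate option values and return them as keyword arguments."""
    if not field_data:
        raise ValueError("field_data is required")
    if not 0.0 <= confidence_threshold <= 1.0:
        raise ValueError("confidence_threshold must be between 0.0 and 1.0")
    if max_pages is not None and max_pages < 1:
        raise ValueError("max_pages must be at least 1")
    kwargs = {
        "field_data": field_data,
        "context": context,
        "confidence_threshold": confidence_threshold,
        "max_pages": max_pages,
        "page_range": page_range,
        "skip_cache": skip_cache,
    }
    # Drop options left at None so the defaults apply.
    return {name: value for name, value in kwargs.items() if value is not None}


kwargs = fill_option_kwargs(
    {"name": {"value": "John Doe", "description": "Full name"}},
    confidence_threshold=0.7,
    max_pages=2,
)
# With the SDK installed (assumed usage):
# options = FormFillingOptions(**kwargs)
```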

Context Parameter

Use context to improve matching for specific form types:
options = FormFillingOptions(
    field_data={...},
    context="W-4 Employee's Withholding Certificate for new hire"
)

REST API

curl -X POST https://www.datalab.to/api/v1/fill \
  -H "X-API-Key: YOUR_API_KEY" \
  -F "file=@form.pdf" \
  -F 'field_data={"name": {"value": "John Doe", "description": "Full name"}}'

Python Example

import requests
import json
import time
import base64

API_KEY = "YOUR_API_KEY"
headers = {"X-API-Key": API_KEY}

field_data = {
    "name": {"value": "John Doe", "description": "Full name"},
    "email": {"value": "john.doe@example.com", "description": "Email"},
    "date": {"value": "12/15/2024", "description": "Date"}
}

# Submit request
with open("form.pdf", "rb") as f:
    response = requests.post(
        "https://www.datalab.to/api/v1/fill",
        files={"file": ("form.pdf", f, "application/pdf")},
        data={
            "field_data": json.dumps(field_data),
            "confidence_threshold": "0.5"
        },
        headers=headers
    )

# Stop early on HTTP errors (for example, an invalid API key)
response.raise_for_status()
check_url = response.json()["request_check_url"]

# Poll for results
while True:
    result = requests.get(check_url, headers=headers).json()

    if result["status"] == "complete":
        # Decode and save the filled form
        pdf_bytes = base64.b64decode(result["output_base64"])
        with open("filled_form.pdf", "wb") as f:
            f.write(pdf_bytes)

        print(f"Fields filled: {result['fields_filled']}")
        print(f"Fields not found: {result['fields_not_found']}")
        break
    elif result["status"] == "failed":
        print(f"Error: {result.get('error')}")
        break

    time.sleep(2)
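The polling loop above runs until the request reaches a final status. In longer-running scripts it may be useful to bound the wait. The sketch below is one way to do that: `poll_until_done` is a hypothetical helper, and the 300-second default timeout is an arbitrary choice, not a documented limit.

```python
import time


def poll_until_done(fetch_status, interval=2.0, timeout=300.0, sleep=time.sleep):
    """Call fetch_status() until status is complete or failed, or time runs out."""
    deadline = time.monotonic() + timeout
    while True:
        result = fetch_status()
        if result["status"] in ("complete", "failed"):
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"no final status after {timeout} seconds")
        sleep(interval)


# Usage with the request from the example above:
# result = poll_until_done(
#     lambda: requests.get(check_url, headers=headers).json()
# )
```

Passing the fetch function in as an argument keeps the helper independent of `requests`, so it can be exercised with canned responses.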

Response

| Field | Type | Description |
| --- | --- | --- |
| status | str | processing, complete, or failed |
| success | bool | Whether filling succeeded |
| output_format | str | pdf or png |
| output_base64 | str | Base64-encoded filled form |
| fields_filled | list | Field names that were filled |
| fields_not_found | list | Field names that couldn’t be matched |
| page_count | int | Pages processed |
| runtime | float | Processing time in seconds |
| cost_breakdown | dict | Cost details |
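Since the output can be either a PDF or a PNG, the file extension can be taken from `output_format` instead of being hard-coded. A minimal sketch follows, with `save_filled_form` as a hypothetical helper. It assumes that a response with status complete may still report `success` as false, and that an `error` key may be present, as in the Python example above.

```python
import base64
from pathlib import Path


def save_filled_form(result, stem="filled_form", directory="."):
    """Decode output_base64 and write it with an extension matching output_format."""
    if result.get("status") != "complete" or not result.get("success"):
        raise RuntimeError(f"Form filling did not succeed: {result.get('error')}")
    path = Path(directory) / f"{stem}.{result['output_format']}"
    path.write_bytes(base64.b64decode(result["output_base64"]))
    return path


# Usage with a completed response:
# path = save_filled_form(result)
# print(f"Saved {path}; unmatched fields: {result['fields_not_found']}")
```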

Supported Form Types

  • PDF with native AcroForm fields - Uses PDF form fields directly
  • PDF with visual fields - Detects field locations and adds text overlays
  • Images (PNG, JPG) - Detects field locations and draws text on image
The API automatically detects the input type and uses the appropriate method.
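When uploading through the REST API, the content type in the multipart upload can be chosen from the file extension. The sketch below covers the input types listed above. `content_type_for` is a hypothetical helper, and treating `.jpeg` the same as `.jpg` is an assumption.

```python
from pathlib import Path

# Standard MIME types for the supported inputs.
SUPPORTED_TYPES = {
    ".pdf": "application/pdf",
    ".png": "image/png",
    ".jpg": "image/jpeg",
    ".jpeg": "image/jpeg",  # assumption: accepted like .jpg
}


def content_type_for(filename):
    """Return the MIME type for a supported form file, or raise ValueError."""
    suffix = Path(filename).suffix.lower()
    if suffix not in SUPPORTED_TYPES:
        raise ValueError(f"unsupported form type: {suffix or filename}")
    return SUPPORTED_TYPES[suffix]


# Usage with requests:
# files = {"file": (name, open(name, "rb"), content_type_for(name))}
```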
Results are deleted from Datalab servers one hour after processing completes.
