The form filling API fills PDF and image forms with your structured data. It works with native PDF form fields and scanned/image forms.
SDK Usage
from datalab_sdk import DatalabClient, FormFillingOptions

client = DatalabClient()

options = FormFillingOptions(
    field_data={
        "name": {"value": "John Doe", "description": "Full name"},
        "email": {"value": "john.doe@example.com", "description": "Email address"},
        "date": {"value": "12/15/2024", "description": "Today's date"},
    }
)

result = client.fill("form.pdf", options=options)
result.save_output("filled_form.pdf")

print(f"Fields filled: {result.fields_filled}")
print(f"Fields not found: {result.fields_not_found}")
See SDK Form Filling for complete SDK documentation.
How It Works
- Upload your form (PDF or image) with field data
- The API detects form fields and matches them to your data
- Fields are filled and the form is returned as PDF or PNG
Provide field names with values and descriptions:
field_data = {
    "field_key": {
        "value": "The value to fill",
        "description": "Description to help match the field"
    }
}
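When the values already live in a flat dictionary, the nested structure above can be produced with a small helper. The following is a minimal sketch, not part of the Datalab SDK: `build_field_data` and its arguments are hypothetical names, and the fallback description derived from the key is an assumption about what makes a useful description.

```python
def build_field_data(values, descriptions=None):
    """Wrap flat key/value pairs in the {"value", "description"} structure.

    Hypothetical helper, not an SDK function. Keys without an explicit
    description fall back to a readable form of the key itself
    (for example "first_name" becomes "First name").
    """
    descriptions = descriptions or {}
    field_data = {}
    for key, value in values.items():
        fallback = key.replace("_", " ").capitalize()
        field_data[key] = {
            "value": str(value),
            "description": descriptions.get(key, fallback),
        }
    return field_data


field_data = build_field_data(
    {"first_name": "John", "ssn": "123-45-6789"},
    descriptions={"ssn": "Social Security Number"},
)
```

Explicit descriptions are likely worth supplying for ambiguous keys, since the description is what helps match a key to a field on the form.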
Examples
Basic fields:
field_data = {
    "first_name": {"value": "John", "description": "First name"},
    "last_name": {"value": "Doe", "description": "Last name"},
    "ssn": {"value": "123-45-6789", "description": "Social Security Number"}
}
Checkboxes:
field_data = {
    "is_citizen": {"value": "yes", "description": "US citizenship status"},
    "agree_terms": {"value": "checked", "description": "Terms agreement"}
}
Values like "yes", "true", "1", "checked", "x" will check boxes.
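If an application stores checkbox state as booleans, they need to be converted to one of these string values first. The sketch below shows one possible approach. `add_checkboxes` is a hypothetical helper, and leaving unchecked boxes out of `field_data` is an assumption, since the values that leave a box unchecked are not documented here.

```python
def add_checkboxes(field_data, flags, descriptions):
    """Return a copy of field_data with an entry for every True flag.

    Hypothetical helper. True becomes "yes", one of the values listed as
    checking a box. False flags are omitted entirely; that omission is an
    assumption, not documented API behavior.
    """
    result = dict(field_data)
    for key, checked in flags.items():
        if checked:
            result[key] = {"value": "yes", "description": descriptions[key]}
    return result


field_data = add_checkboxes(
    {},
    flags={"is_citizen": True, "agree_terms": False},
    descriptions={
        "is_citizen": "US citizenship status",
        "agree_terms": "Terms agreement",
    },
)
```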
Compound data:
field_data = {
    "full_address": {
        "value": "123 Main St, New York, NY, 10001",
        "description": "Complete address"
    }
}
The API can split compound data across multiple form fields.
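Going the other way, separately stored parts can be joined into a single compound value before submission. This is a small sketch assuming a comma-separated format like the example above; `join_address` is a hypothetical helper, and the separator the API prefers is not specified here.

```python
def join_address(street, city, state, postal_code):
    """Join address parts into one comma-separated compound value."""
    parts = [street, city, state, postal_code]
    # Skip empty parts so the result has no dangling separators.
    return ", ".join(part.strip() for part in parts if part and part.strip())


field_data = {
    "full_address": {
        "value": join_address("123 Main St", "New York", "NY", "10001"),
        "description": "Complete address",
    }
}
```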
Options
| Option | Type | Default | Description |
|---|---|---|---|
| field_data | dict | Required | Field names mapped to values and descriptions |
| context | str | None | Additional context to help match fields |
| confidence_threshold | float | 0.5 | Minimum confidence for field matching (0.0-1.0) |
| max_pages | int | None | Maximum pages to process |
| page_range | str | None | Specific pages to process |
| skip_cache | bool | False | Skip cached results |
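For callers who send these options as form fields over the REST API, it can help to validate and serialize them in one place. The following is a rough sketch: `build_form_data` is a hypothetical helper, the option names come from the table above, and converting numbers with `str()` is an assumption based on multipart form fields being plain strings. `skip_cache` is left out because its wire encoding is not documented here.

```python
import json


def build_form_data(field_data, context=None, confidence_threshold=0.5,
                    max_pages=None, page_range=None):
    """Build string-valued form fields for a multipart request.

    Hypothetical helper. Options left as None are omitted so the
    server-side defaults apply.
    """
    if not field_data:
        raise ValueError("field_data is required")
    if not 0.0 <= confidence_threshold <= 1.0:
        raise ValueError("confidence_threshold must be between 0.0 and 1.0")

    form = {
        "field_data": json.dumps(field_data),
        "confidence_threshold": str(confidence_threshold),
    }
    optional = {"context": context, "max_pages": max_pages, "page_range": page_range}
    for name, value in optional.items():
        if value is not None:
            form[name] = str(value)
    return form


form = build_form_data(
    {"name": {"value": "John Doe", "description": "Full name"}},
    context="W-4 Employee's Withholding Certificate for new hire",
    confidence_threshold=0.7,
)
```

The resulting dictionary could be passed as the `data=` argument of `requests.post`.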
Context Parameter
Use context to improve matching for specific form types:
options = FormFillingOptions(
    field_data={...},
    context="W-4 Employee's Withholding Certificate for new hire"
)
REST API
curl -X POST https://www.datalab.to/api/v1/fill \
  -H "X-API-Key: YOUR_API_KEY" \
  -F "file=@form.pdf" \
  -F 'field_data={"name": {"value": "John Doe", "description": "Full name"}}'
Python Example
import base64
import json
import time

import requests

API_KEY = "YOUR_API_KEY"
headers = {"X-API-Key": API_KEY}

field_data = {
    "name": {"value": "John Doe", "description": "Full name"},
    "email": {"value": "john.doe@example.com", "description": "Email"},
    "date": {"value": "12/15/2024", "description": "Date"}
}

# Submit request
with open("form.pdf", "rb") as f:
    response = requests.post(
        "https://www.datalab.to/api/v1/fill",
        files={"file": ("form.pdf", f, "application/pdf")},
        data={
            "field_data": json.dumps(field_data),
            "confidence_threshold": "0.5"
        },
        headers=headers
    )
response.raise_for_status()  # Stop early on HTTP errors
check_url = response.json()["request_check_url"]

# Poll for results
while True:
    result = requests.get(check_url, headers=headers).json()
    if result["status"] == "complete":
        # Decode and save the filled form
        pdf_bytes = base64.b64decode(result["output_base64"])
        with open("filled_form.pdf", "wb") as f:
            f.write(pdf_bytes)
        print(f"Fields filled: {result['fields_filled']}")
        print(f"Fields not found: {result['fields_not_found']}")
        break
    elif result["status"] == "failed":
        print(f"Error: {result.get('error')}")
        break
    time.sleep(2)  # Wait before checking again
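The polling loop above runs until the API reports a final status, so a request that never finishes would block forever. One possible refinement is to bound the wait. In the sketch below, `poll_until_done` is a hypothetical helper and the 300-second default timeout is an arbitrary assumption. The status fetcher is passed in as a callable, which lets the example run without network access.

```python
import time


def poll_until_done(fetch_status, interval=2.0, timeout=300.0, sleep=time.sleep):
    """Call fetch_status() until the result is complete or failed.

    Hypothetical helper. `fetch_status` is any zero-argument callable that
    returns the parsed JSON dict from the check URL. Raises TimeoutError
    if no final status arrives within `timeout` seconds.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = fetch_status()
        if result["status"] in ("complete", "failed"):
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"No final status after {timeout} seconds")
        sleep(interval)


# Stand-in for requests.get(check_url, headers=headers).json(), used here
# so the sketch runs offline.
_responses = iter([{"status": "processing"}, {"status": "complete"}])
result = poll_until_done(lambda: next(_responses), interval=0.0)
```

Against the real API, the callable would be `lambda: requests.get(check_url, headers=headers).json()`.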
Response
| Field | Type | Description |
|---|---|---|
| status | str | processing, complete, or failed |
| success | bool | Whether filling succeeded |
| output_format | str | pdf or png |
| output_base64 | str | Base64-encoded filled form |
| fields_filled | list | Field names that were filled |
| fields_not_found | list | Field names that couldn’t be matched |
| page_count | int | Pages processed |
| runtime | float | Processing time in seconds |
| cost_breakdown | dict | Cost details |
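Because `output_format` can be either pdf or png, the saved file's extension is best taken from the response instead of being hard-coded. The following sketch illustrates this under the assumption that a finished response carries the fields in the table above; `save_filled_form` is a hypothetical helper and the sample response is made up for the example.

```python
import base64
import os
import tempfile


def save_filled_form(result, basename, directory="."):
    """Decode output_base64 and write it as <basename>.<output_format>.

    Hypothetical helper; field names are those in the response table.
    """
    if result.get("status") != "complete" or not result.get("success"):
        raise RuntimeError(f"Form filling did not succeed: {result.get('error')}")
    path = os.path.join(directory, f"{basename}.{result['output_format']}")
    with open(path, "wb") as out:
        out.write(base64.b64decode(result["output_base64"]))
    return path


# Made-up response used only to exercise the helper.
sample = {
    "status": "complete",
    "success": True,
    "output_format": "png",
    "output_base64": base64.b64encode(b"not a real image").decode("ascii"),
    "fields_filled": ["name"],
    "fields_not_found": ["email"],
}
saved_path = save_filled_form(sample, "filled_form", directory=tempfile.mkdtemp())
```

A non-empty `fields_not_found` list is a signal to revisit those fields' descriptions or add `context` before retrying.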
Supported input types:
- PDF with native AcroForm fields - Uses PDF form fields directly
- PDF with visual fields - Detects field locations and adds text overlays
- Images (PNG, JPG) - Detects field locations and draws text on image
The API automatically detects the input type and chooses the matching method, so no extra option is needed to select one.
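Although detection happens server-side, a client that uploads both PDFs and images still has to label each file with a content type. As a hedged sketch, Python's standard `mimetypes` module can derive it from the file name. `describe_upload` and `SUPPORTED_TYPES` are hypothetical names, and the set of accepted types is inferred from the input types listed above, not from an official list.

```python
import mimetypes
import os

# Inferred from the documented input types (PDF, PNG, JPG); an assumption.
SUPPORTED_TYPES = {"application/pdf", "image/png", "image/jpeg"}


def describe_upload(path):
    """Return (filename, content_type) for a supported input file."""
    content_type, _ = mimetypes.guess_type(path)
    if content_type not in SUPPORTED_TYPES:
        raise ValueError(f"Unsupported input type for {path!r}: {content_type}")
    return os.path.basename(path), content_type


name, content_type = describe_upload("scans/form.jpg")
```

The pair could then fill the `files=` tuple of a `requests.post` call, for example `files={"file": (name, open(path, "rb"), content_type)}`.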
Results are deleted from Datalab servers one hour after processing completes.