- A Datalab account with an API key (new accounts include $5 in free credits)
- Python 3.10+ installed
- The Datalab SDK: pip install datalab-python-sdk
- Your DATALAB_API_KEY environment variable set
Overview
Saved Schemas let you store extraction schemas in Datalab and reference them by ID (schema_id) when calling /api/v1/extract. Instead of sending a full JSON schema with every request, you save it once and reuse it through that stable ID.
Saved schemas also support versioning — you can update a schema while keeping a history of previous versions and pin extractions to a specific version using schema_version.
Create a Schema
Create and manage extraction schemas in the Datalab UI. Each schema is assigned a schema_id (e.g. sch_k8Hx9mP2nQ4v) that you can reference in extraction requests.
Extract Using a Saved Schema
Pass schema_id to /api/v1/extract instead of page_schema.
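As a rough sketch of such a request, the snippet below builds (but does not send) a POST to the extract endpoint that carries schema_id and no page_schema. The base URL, the X-API-Key header, the file_url field, and the form-encoded body are assumptions that this page does not confirm, so check them against the API reference before relying on them.

```python
import os
import urllib.parse
import urllib.request

# Assumptions (not confirmed by this page): base URL, X-API-Key header,
# the file_url field name, and a form-encoded request body.
EXTRACT_URL = "https://www.datalab.to/api/v1/extract"


def build_extract_request(schema_id: str, file_url: str, api_key: str) -> urllib.request.Request:
    """Build a POST request that references a saved schema by ID."""
    fields = {
        "file_url": file_url,    # hypothetical document-location field
        "schema_id": schema_id,  # sent instead of an inline page_schema
    }
    body = urllib.parse.urlencode(fields).encode("utf-8")
    return urllib.request.Request(
        EXTRACT_URL,
        data=body,
        headers={"X-API-Key": api_key},
        method="POST",
    )


request = build_extract_request(
    schema_id="sch_k8Hx9mP2nQ4v",
    file_url="https://example.com/invoice.pdf",
    api_key=os.environ.get("DATALAB_API_KEY", "placeholder-key"),
)
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
# To actually send it: urllib.request.urlopen(request)
```

Building the request separately from sending it makes it easy to inspect exactly which fields go over the wire.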
Schema Versioning
When you update a schema in the Datalab UI, you can choose to create a new version. This saves the current state to version history and increments the version number.
Pin to a specific version
Pass schema_version alongside schema_id to use a specific version.
Omitting schema_version always uses the latest version.
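The difference between pinning and following the latest version can be sketched with a small helper that only adds schema_version when a version is given. The helper name schema_fields is hypothetical; only the two parameter names come from this page.

```python
from typing import Optional


def schema_fields(schema_id: str, schema_version: Optional[int] = None) -> dict:
    """Return the schema-related request fields (hypothetical helper).

    Leaving schema_version out means the latest version is used.
    """
    fields: dict = {"schema_id": schema_id}
    if schema_version is not None:
        fields["schema_version"] = schema_version  # pin to this version
    return fields


latest = schema_fields("sch_k8Hx9mP2nQ4v")
pinned = schema_fields("sch_k8Hx9mP2nQ4v", schema_version=2)
print(latest)
print(pinned)
```

Pinning is useful when downstream code depends on a fixed set of fields and should not change behavior when the schema is edited.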
List Schemas
The response contains schemas (array) and total (count). Schemas are ordered by creation date, newest first.
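A minimal sketch of consuming a list response, assuming it has already been parsed into a Python dict. The sample payload is illustrative only (the second ID and both names are made up), and summarize_schema_list is a hypothetical helper, not part of the SDK.

```python
def summarize_schema_list(response: dict) -> list[str]:
    """Turn a parsed list response into one summary line per schema."""
    schemas = response["schemas"]
    lines = [f"showing {len(schemas)} of {response['total']} schemas"]
    for schema in schemas:  # already ordered newest first by the API
        lines.append(f"{schema['schema_id']}  v{schema['version']}  {schema['name']}")
    return lines


# Illustrative payload; only the keys mirror the documented response.
sample_response = {
    "schemas": [
        {"schema_id": "sch_k8Hx9mP2nQ4v", "name": "Invoice fields", "version": 3},
        {"schema_id": "sch_example0002", "name": "Receipt fields", "version": 1},
    ],
    "total": 2,
}

lines = summarize_schema_list(sample_response)
print("\n".join(lines))
```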
Get a Schema
Archive a Schema
Archiving soft-deletes a schema. It no longer appears in list results (unless include_archived=true) and cannot be used for new extractions.
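Because an archived schema cannot be used for new extractions, a client may want to check the archived flag before submitting work. The guard below is a sketch of that check on an already-fetched schema object; ArchivedSchemaError and ensure_not_archived are hypothetical names, and how the API itself reports this case is not described on this page.

```python
class ArchivedSchemaError(ValueError):
    """Raised client-side when a schema is archived (hypothetical error type)."""


def ensure_not_archived(schema: dict) -> dict:
    """Return the schema unchanged, or raise if its archived flag is set."""
    if schema.get("archived"):
        raise ArchivedSchemaError(
            f"schema {schema.get('schema_id')} is archived and cannot be used for new extractions"
        )
    return schema


active = ensure_not_archived({"schema_id": "sch_k8Hx9mP2nQ4v", "archived": False})
print(active["schema_id"])
```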
API Reference
Schema Object
| Field | Type | Description |
|---|---|---|
| schema_id | string | Stable string ID (e.g. sch_k8Hx9mP2nQ4v) |
| name | string | Human-readable name (max 200 chars) |
| description | string or null | Optional description |
| schema_json | object | JSON schema with a properties key |
| version | int | Current version number (starts at 1) |
| version_history | array | Previous versions saved with create_new_version: true |
| archived | bool | Whether the schema is archived |
| created | datetime | Creation timestamp |
| updated | datetime | Last update timestamp |
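The table above maps naturally onto a small typed container. The dataclass below is one possible client-side representation, not an SDK type. It keeps created and updated as plain strings because the timestamp format is not specified here, and it leaves version_history entries untyped for the same reason.

```python
from dataclasses import dataclass, field, fields
from typing import Any, Optional


@dataclass
class SavedSchema:
    """Client-side mirror of the Schema Object (hypothetical class)."""

    schema_id: str
    name: str
    schema_json: dict[str, Any]
    version: int = 1
    description: Optional[str] = None
    version_history: list[Any] = field(default_factory=list)
    archived: bool = False
    created: Optional[str] = None  # timestamp kept as received
    updated: Optional[str] = None  # timestamp kept as received

    @classmethod
    def from_dict(cls, data: dict[str, Any]) -> "SavedSchema":
        """Build an instance from a parsed API object, checking documented limits."""
        if len(data["name"]) > 200:
            raise ValueError("name exceeds 200 characters")
        if "properties" not in data["schema_json"]:
            raise ValueError("schema_json must contain a 'properties' key")
        known = {f.name for f in fields(cls)}
        # Ignore any keys this sketch does not know about.
        return cls(**{k: v for k, v in data.items() if k in known})


schema = SavedSchema.from_dict(
    {
        "schema_id": "sch_k8Hx9mP2nQ4v",
        "name": "Invoice fields",
        "schema_json": {"properties": {"total": {"type": "number"}}},
        "version": 2,
    }
)
print(schema.schema_id, schema.version, schema.archived)
```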
/extract Parameters (schema-related)
| Parameter | Type | Description |
|---|---|---|
| schema_id | string | ID of a saved schema. Mutually exclusive with page_schema. |
| schema_version | int | Version to use. Only valid with schema_id. Defaults to latest. |
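The two constraints in this table can be checked client-side before a request is sent. The function below is a sketch of that validation; validate_schema_params is a hypothetical name, and the server's own error responses for these cases are not documented here.

```python
from typing import Any, Optional


def validate_schema_params(
    page_schema: Optional[dict] = None,
    schema_id: Optional[str] = None,
    schema_version: Optional[int] = None,
) -> dict[str, Any]:
    """Enforce the documented rules and return only the fields that were set."""
    if schema_id is not None and page_schema is not None:
        raise ValueError("schema_id and page_schema are mutually exclusive")
    if schema_version is not None and schema_id is None:
        raise ValueError("schema_version is only valid together with schema_id")
    candidates = {
        "page_schema": page_schema,
        "schema_id": schema_id,
        "schema_version": schema_version,
    }
    return {key: value for key, value in candidates.items() if value is not None}


params = validate_schema_params(schema_id="sch_k8Hx9mP2nQ4v", schema_version=3)
print(params)
```

Failing early on these combinations gives a clearer error than waiting for the API to reject the request.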
Next Steps
Structured Extraction
Full guide to extraction with inline schemas, checkpoints, and options.
Confidence Scoring
Score extraction results with per-field confidence ratings.
Forge Evals
Compare extraction results across configurations using saved schemas.
Handling Long Documents
Strategies for extracting from 100+ page documents.