Score your structured extraction results to get per-field confidence ratings (1–5) with reasoning that explains what evidence was found or missing.
Extraction scoring is in beta and free to use. We'd love your feedback; reach out at support@datalab.to.
Before you begin, make sure you have:
  1. A Datalab account with an API key (new accounts include $5 in free credits)
  2. Python 3.10+ installed
  3. The Datalab SDK: pip install datalab-python-sdk
  4. Your DATALAB_API_KEY environment variable set

How It Works

Scoring runs automatically after every extraction. When you poll request_check_url, the extraction result initially contains just the extracted fields and citations. If you continue polling the same URL, the response will eventually include _score fields and an extraction_score_average once scoring completes. Each scored field receives:
  • A score from 1 (very low confidence) to 5 (high confidence)
  • A reasoning string explaining what evidence supports or undermines the extracted value
No extra parameters or endpoints are needed — just keep polling until scores appear.
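
Concretely, the two phases can be folded into one bounded polling loop. Here is a minimal sketch; the wait_for_scores helper, timeout, and interval are our own additions, while request_check_url, status, and extraction_score_average come from the API as described above:

import time
import requests

def wait_for_scores(check_url, headers, timeout=300, interval=2):
    """Poll a request_check_url until scoring has completed or the timeout expires."""
    deadline = time.time() + timeout
    while time.time() < deadline:
        result = requests.get(check_url, headers=headers).json()
        # The extraction finishes first; scores are attached later by an async pass,
        # so wait for both "complete" status and the score average to appear.
        if result.get("status") == "complete" and "extraction_score_average" in result:
            return result
        time.sleep(interval)
    raise TimeoutError("Scoring did not complete within the timeout")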

Example

import json
import os
import time

import requests

headers = {"X-API-Key": os.getenv("DATALAB_API_KEY")}

schema = {
    "type": "object",
    "properties": {
        "invoice_number": {"type": "string", "description": "Invoice ID or number"},
        "total_amount": {"type": "number", "description": "Total amount due"},
        "vendor_name": {"type": "string", "description": "Vendor or company name"}
    },
    "required": ["invoice_number", "total_amount"]
}

with open("invoice.pdf", "rb") as f:
    resp = requests.post(
        "https://www.datalab.to/api/v1/extract",
        files={"file": ("invoice.pdf", f, "application/pdf")},
        data={
            "page_schema": json.dumps(schema),
            "mode": "balanced"
        },
        headers=headers
    )
resp.raise_for_status()  # surface HTTP errors before parsing the body
check_url = resp.json()["request_check_url"]

# Poll until extraction is complete
while True:
    result = requests.get(check_url, headers=headers).json()
    if result["status"] == "complete":
        extracted = json.loads(result["extraction_schema_json"])
        print("Extraction:", extracted)
        break
    time.sleep(2)

# Continue polling — scores are enriched asynchronously
while "extraction_score_average" not in result:
    time.sleep(2)
    result = requests.get(check_url, headers=headers).json()

scored = json.loads(result["extraction_schema_json"])
for key, value in scored.items():
    if key.endswith("_score"):
        field = key.removesuffix("_score")  # strip only the trailing suffix
        print(f"{field}: score={value['score']}, reasoning={value['reasoning']}")

Response Format

Before scoring completes, extraction_schema_json contains just the fields and citations:
{
  "invoice_number": "INV-2024-001",
  "invoice_number_citations": ["block_123"],
  "total_amount": 1500.00,
  "total_amount_citations": ["block_456"]
}
Once scoring completes, each field also gets a _score object, and the top-level response includes an extraction_score_average:
{
  "invoice_number": "INV-2024-001",
  "invoice_number_citations": ["block_123"],
  "invoice_number_score": {
    "score": 5,
    "reasoning": "Value found verbatim in the document header with a matching citation."
  },
  "total_amount": 1500.00,
  "total_amount_citations": ["block_456"],
  "total_amount_score": {
    "score": 4,
    "reasoning": "Amount found in the totals row; minor ambiguity due to a subtotal nearby."
  }
}
Here extraction_score_average is 4.5, the mean of all field scores: (5 + 4) / 2.
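
If you want to verify the average yourself, it can be recomputed from the per-field _score objects. A minimal sketch, where scored is the parsed extraction_schema_json from the example above:

scores = [v["score"] for k, v in scored.items() if k.endswith("_score")]
print(sum(scores) / len(scores))  # (5 + 4) / 2 == 4.5 for the response above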

Score Rubric

Score | Meaning
5 | High confidence — clear match with strong citation support
4 | Good confidence — match found with minor ambiguity
3 | Moderate confidence — partial match or uncertain citation
2 | Low confidence — match is inferred or weakly supported
1 | Very low confidence — no clear evidence found
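
One way to apply the rubric in code is a small triage function. The thresholds and policy names below are illustrative, not part of the API:

def triage(score: int) -> str:
    """Map a 1-5 confidence score to an illustrative handling policy."""
    if score >= 4:
        return "auto-accept"   # 4-5: strong evidence, safe to take as-is
    if score == 3:
        return "spot-check"    # 3: partial match, sample for manual QA
    return "human-review"      # 1-2: weak or no evidence, always review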

Using Scores in Practice

Use extraction_score_average for a quick quality check, then inspect individual _score fields to flag low-confidence results:
import json

# After polling has returned a scored result
avg = result["extraction_score_average"]
print(f"Average score: {avg}")

scored = json.loads(result["extraction_schema_json"])
for key, value in scored.items():
    if not key.endswith("_score"):
        continue

    field = key.removesuffix("_score")  # strip only the trailing suffix
    if value["score"] <= 2:
        print(f"Low confidence for '{field}': {value['reasoning']}")
    elif value["score"] >= 4:
        print(f"'{field}' = {scored[field]}")
This is useful for building review workflows: auto-accept high-confidence fields and route low-confidence ones to a human reviewer, as in the sketch below.
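
A minimal sketch of that routing, assuming the scored dict from above; the acceptance threshold of 4 is illustrative:

accepted, needs_review = {}, []
for key, value in scored.items():
    if not key.endswith("_score"):
        continue
    field = key.removesuffix("_score")
    if value["score"] >= 4:
        accepted[field] = scored[field]  # high confidence: take automatically
    else:
        # low/moderate confidence: queue for a human along with the reasoning
        needs_review.append({"field": field, "value": scored.get(field), "why": value["reasoning"]})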

Next Steps

  • Structured Extraction: Full extraction API reference and schema examples
  • Handling Long Documents: Strategies for extracting from 100+ page documents
  • Pipelines: Chain processors into versioned, reusable pipelines
  • Document Conversion: Convert documents to various formats