Prerequisites
- An active Datalab account
- An API key (get one from your account settings)
- A workflow-enabled subscription plan
This tutorial covers:
- Creating a workflow
- Executing with a single file
- Executing with multiple files in parallel
- Checking execution status and retrieving results
Tutorial: Invoice Data Extraction
We’ll build a workflow that:
- Parses a PDF invoice
- Extracts structured data (invoice number, vendor, amount, line items)
Create the Workflow
Define your workflow template. It is created once and can be executed many times with different files.
Understanding the Structure
Parse Step:
- step_key: "marker_parse": Uses Marker to parse the PDF
- unique_name: "parse": Referenced by the extract step
- max_pages: 10: Processes only the first 10 pages (cost optimization)
Extract Step:
- step_key: "marker_extract": Extracts structured data
- unique_name: "extract": Identifies this step in results
- page_schema: Defines what data to extract
- depends_on: ["parse"]: Waits for parse to complete
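The structure described above could be assembled as a plain Python dictionary before it is sent to the API. This is a minimal sketch, not the official request format: the step keys, unique names, `max_pages`, `page_schema`, and `depends_on` come from this tutorial, while the top-level `name` and `steps` fields, the `settings` nesting, and the contents of the schema are assumptions made for illustration.

```python
import json


def build_invoice_workflow(name="Invoice Extraction"):
    """Return a workflow definition with a parse step and a dependent extract step."""
    # Hypothetical JSON Schema covering the fields this tutorial extracts.
    page_schema = {
        "type": "object",
        "properties": {
            "invoice_number": {"type": "string"},
            "vendor": {"type": "string"},
            "amount": {"type": "number"},
            "line_items": {"type": "array", "items": {"type": "object"}},
        },
    }
    return {
        "name": name,  # assumed field name
        "steps": [     # assumed field name
            {
                "step_key": "marker_parse",
                "unique_name": "parse",
                "settings": {"max_pages": 10},  # assumed nesting
            },
            {
                "step_key": "marker_extract",
                "unique_name": "extract",
                "settings": {"page_schema": page_schema},  # assumed nesting
                "depends_on": ["parse"],  # run only after the parse step
            },
        ],
    }


definition = build_invoice_workflow()
print(json.dumps(definition, indent=2))
```

Keeping the definition in a function makes it easy to reuse the same template with a different schema or page limit.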
Save the workflow_id returned when the workflow is created; you’ll use it to execute the workflow.
Execute with a Single File
Now execute your workflow with a single invoice. Save the execution_id that the call returns; you’ll use it to check status.
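As a rough illustration of the execution call, the sketch below only builds the request and does not send it. Everything about its shape is an assumption: the placeholder base URL, the `/workflows/{workflow_id}/execute` path, and the `input_config` and `file_urls` field names are hypothetical, so check the API reference for the real values.

```python
def build_execution_request(base_url, workflow_id, file_url):
    """Describe a request that runs one workflow against one input file."""
    if not file_url:
        raise ValueError("file_url must be a non-empty string")
    return {
        # Hypothetical endpoint path; the real path may differ.
        "url": f"{base_url.rstrip('/')}/workflows/{workflow_id}/execute",
        # Hypothetical body: a list holding the single invoice to process.
        "json": {"input_config": {"file_urls": [file_url]}},
    }


request = build_execution_request(
    "https://api.example.com/v1",          # placeholder base URL
    "wf_123",                              # placeholder workflow_id
    "https://example.com/invoice-001.pdf"  # placeholder invoice URL
)
print(request["url"])
```

Separating request construction from sending keeps the sketch runnable offline and makes the payload easy to inspect before any API call is made.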
Check Execution Status
Poll the execution endpoint to track progress.
While Processing
When Complete
Understanding the Results
Status Codes:
- PENDING: Queued, not started yet
- IN_PROGRESS: Steps are running
- COMPLETED: All steps finished successfully
- FAILED: An error occurred
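The status codes above suggest a simple polling loop: keep checking until the execution reaches COMPLETED or FAILED. The sketch below assumes the status arrives in a `status` field of the response and takes a caller-supplied `fetch_status` function (hypothetical, standing in for the real HTTP call), so it runs here against a fake instead of the API.

```python
import time

# COMPLETED and FAILED end the run; PENDING and IN_PROGRESS mean keep waiting.
TERMINAL_STATUSES = {"COMPLETED", "FAILED"}


def wait_for_execution(fetch_status, execution_id, interval=2.0, timeout=300.0,
                       sleep=time.sleep, clock=time.monotonic):
    """Poll fetch_status(execution_id) until a terminal status or the timeout."""
    deadline = clock() + timeout
    while True:
        execution = fetch_status(execution_id)
        if execution["status"] in TERMINAL_STATUSES:
            return execution
        if clock() >= deadline:
            raise TimeoutError(f"execution {execution_id} did not finish in {timeout}s")
        sleep(interval)


# Fake fetcher that walks through the documented statuses in order.
_statuses = iter(["PENDING", "IN_PROGRESS", "COMPLETED"])


def fake_fetch(execution_id):
    return {"execution_id": execution_id, "status": next(_statuses)}


result = wait_for_execution(fake_fetch, "exec_123", interval=0)
print(result["status"])  # prints COMPLETED
```

In real use, `fetch_status` would wrap the request to the execution endpoint, and a FAILED result should be inspected for error details rather than treated as success.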
Execute with Multiple Files
The same workflow can process multiple invoices in parallel.
Try it out
- Use Conditional Logic: Explore Conditional Routing