What Are Workflows?
A workflow is a reusable template that defines a series of document processing steps. Each workflow:
- Consists of multiple steps that execute in order based on dependencies
- Can process single or multiple documents in parallel
- Passes data between steps automatically
- Handles errors gracefully with per-file isolation
Key Concepts
Steps
A step is a single operation in your workflow (parse, extract, segment). Steps are defined with:
- step_key: The type of operation (marker_parse, marker_extract, etc.)
- unique_name: A unique identifier for this step within the workflow
- settings: Configuration specific to the step type
- depends_on: List of step names that must complete before this step runs
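As a minimal sketch of what such step definitions might look like, the snippet below models three steps as plain Python dictionaries and checks them locally. The four field names come from the list above; the unique_name values, the empty settings, and the validate_steps helper are illustrative assumptions, not part of the Workflows API.

```python
from typing import Any

# Hypothetical step definitions. Only the four field names are taken from the
# documentation; the names and the empty settings are placeholders.
steps: list[dict[str, Any]] = [
    {"step_key": "marker_parse", "unique_name": "parse",
     "settings": {}, "depends_on": []},
    {"step_key": "marker_extract", "unique_name": "extract",
     "settings": {}, "depends_on": ["parse"]},
    {"step_key": "marker_segment", "unique_name": "segment",
     "settings": {}, "depends_on": ["parse"]},
]


def validate_steps(steps: list[dict[str, Any]]) -> list[str]:
    """Check that names are unique and every dependency names a defined step."""
    names = [s["unique_name"] for s in steps]
    duplicates = {n for n in names if names.count(n) > 1}
    if duplicates:
        raise ValueError(f"duplicate unique_name values: {sorted(duplicates)}")
    known = set(names)
    for s in steps:
        missing = [d for d in s["depends_on"] if d not in known]
        if missing:
            raise ValueError(
                f"step {s['unique_name']!r} depends on unknown steps: {missing}"
            )
    return names


print(validate_steps(steps))
```

A local check like this can catch a misspelled dependency before the template is submitted.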
Dependencies
Steps execute in order based on their depends_on array. Multiple steps can depend on the same parent and will execute in parallel once their dependencies are satisfied.
Execution Flow
- Create workflow: Define the template once
- Execute workflow: Run it with specific files (single or multiple)
- Track progress: Poll execution status
- Get results: Retrieve structured outputs organized by step and file
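To make the depends_on ordering rule from the Dependencies section concrete, here is a small local sketch that groups steps into "waves": every step in a wave has all of its dependencies satisfied and could run in parallel. It uses Python's standard graphlib module; the step names and the execution_waves helper are assumptions for illustration, and the actual scheduler may behave differently.

```python
from graphlib import TopologicalSorter


def execution_waves(depends_on: dict[str, list[str]]) -> list[list[str]]:
    """Group steps into waves whose members can run in parallel."""
    sorter = TopologicalSorter(depends_on)  # maps step -> its dependencies
    sorter.prepare()  # raises graphlib.CycleError on circular dependencies
    waves: list[list[str]] = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # steps with all dependencies done
        waves.append(ready)
        sorter.done(*ready)
    return waves


# Hypothetical workflow: two steps share the same parent.
graph = {"parse": [], "extract": ["parse"], "segment": ["parse"]}
print(execution_waves(graph))
```

Here extract and segment land in the same wave because both depend only on parse.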
Available Step Types
You can get a complete list of available step types programmatically.
marker_parse
Parse PDF documents into markdown format. Automatically saves a checkpoint when followed by extraction or segmentation steps.
Output:
- checkpoint_id: Reference for downstream extraction/segmentation
- lookup_key: Request ID for checking parse results
- result: Full parse output including markdown content
Use Cases:
- Convert PDFs to markdown for downstream processing
- Create checkpoints for multiple extraction attempts
- Parse documents with different quality settings
marker_extract
Extract structured data from parsed documents using LLM-powered extraction.
Output:
- Structured data matching your schema
- lookup_key: Request ID for checking extraction results
Extraction uses the checkpoint_id from the most recent parse step in the dependency chain.
Use Cases:
- Extract invoice data (amounts, dates, line items)
- Pull document metadata (title, author, date)
- Parse forms and structured documents
- Extract table data into JSON
marker_segment
Segment documents into logical sections using LLM-powered detection.
Output:
- Identified segments with page ranges
- Segment metadata and content
Use Cases:
- Split research papers by section
- Identify document structure (header, body, footer)
- Find specific sections in long documents
- Break reports into chapters
conditional
Make routing decisions based on step outputs or document properties. Enable different downstream steps based on conditions.
Comparison operators: >=, <=, =, !=, >, <
Logic: AND, OR
Output:
- condition_result: Boolean result of the evaluation
- enabled_steps: Array of steps enabled by this route
Use Cases:
- Re-parse low-quality documents with OCR
- Route invoices above a threshold to detailed extraction
- Skip extraction for empty pages
- Apply different processing based on document type
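The routing behavior described for the conditional step can be sketched locally as follows. The operator set, the AND/OR logic, and the condition_result and enabled_steps output names come from this section; the rule format (field, operator, value), the evaluate_route helper, and the step names are hypothetical and do not reflect the real settings schema.

```python
import operator
from typing import Any

# The six comparison operators listed for the conditional step.
OPERATORS = {
    ">=": operator.ge, "<=": operator.le, "=": operator.eq,
    "!=": operator.ne, ">": operator.gt, "<": operator.lt,
}


def evaluate_route(rules: list[dict[str, Any]], logic: str,
                   values: dict[str, Any],
                   if_true: list[str], if_false: list[str]) -> dict[str, Any]:
    """Evaluate rules against values and pick which steps to enable."""
    if logic not in ("AND", "OR"):
        raise ValueError(f"unsupported logic: {logic!r}")
    checks = [
        OPERATORS[rule["operator"]](values[rule["field"]], rule["value"])
        for rule in rules
    ]
    result = all(checks) if logic == "AND" else any(checks)
    return {
        "condition_result": result,
        "enabled_steps": if_true if result else if_false,
    }


# Hypothetical rule: re-parse with OCR when the quality score is below 3.
route = evaluate_route(
    rules=[{"field": "parse_quality_score", "operator": "<", "value": 3}],
    logic="AND",
    values={"parse_quality_score": 2.1},
    if_true=["reparse_with_ocr"],
    if_false=["extract"],
)
print(route)
```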
await_parse_quality
Wait for parse quality scoring to complete. Used before conditional routing based on quality.
Output:
- parse_quality_score: Quality score from 0-5
- quality_metadata: Additional quality information (OCR detection, page count)
This step follows a marker_parse step. Quality scores are typically available within 30-60 seconds.
Multi-File Processing
Workflows automatically handle multiple files efficiently.
Parallel Execution
When you provide multiple files:
- Each file is processed independently through all single_input steps
- Files execute in parallel (not sequentially)
- Results are organized by file ID
- If one file fails, others continue processing
Performance
- Total execution time ≈ time for slowest file (not sum of all files)
- No fixed cap on the number of files beyond your plan limits
- Each file counts as a separate request for rate limiting
Example Flow
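As a rough local illustration of the parallel behavior described above, and not the service's actual implementation, the sketch below simulates three files with asyncio. The file IDs, the durations, and the process_file and run_all helpers are made up; the point is only that total time tracks the slowest file rather than the sum.

```python
import asyncio
import time


async def process_file(file_id: str, seconds: float) -> tuple[str, str]:
    """Stand-in for running one file through every step of a workflow."""
    await asyncio.sleep(seconds)  # simulated processing time
    return file_id, "done"


async def run_all(durations: dict[str, float]) -> tuple[dict[str, str], float]:
    """Process all files concurrently and measure wall-clock time."""
    start = time.perf_counter()
    pairs = await asyncio.gather(
        *(process_file(file_id, secs) for file_id, secs in durations.items())
    )
    return dict(pairs), time.perf_counter() - start


# Simulated durations sum to 0.6 s, but the slowest file takes 0.3 s.
results, elapsed = asyncio.run(
    run_all({"file_a": 0.1, "file_b": 0.3, "file_c": 0.2})
)
print(results, f"elapsed is about {elapsed:.2f}s")
```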
Execution Lifecycle
Status Progression
- PENDING: Execution created, queued to start
- IN_PROGRESS: Steps are actively running
- COMPLETED: All steps finished successfully
- FAILED: One or more critical errors occurred
Tracking Progress
Poll the execution endpoint to track real-time progress. The response includes:
- Current status
- Completed steps and their outputs
- In-progress steps
- Any errors encountered
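A polling loop along these lines is one way to wait for a terminal status. The four status names come from the Status Progression list; the wait_for_execution helper, the "status" key, and the fetch callable (a stand-in for whatever HTTP call retrieves the execution) are assumptions, so no real endpoint or client is shown.

```python
import time
from typing import Callable

TERMINAL_STATUSES = {"COMPLETED", "FAILED"}


def wait_for_execution(fetch: Callable[[], dict], interval: float = 2.0,
                       timeout: float = 300.0) -> dict:
    """Call fetch() until the execution reaches a terminal status."""
    deadline = time.monotonic() + timeout
    while True:
        execution = fetch()  # assumed to return a dict with a "status" key
        if execution["status"] in TERMINAL_STATUSES:
            return execution
        if time.monotonic() >= deadline:
            raise TimeoutError(f"still {execution['status']} after {timeout}s")
        time.sleep(interval)


# Fake fetch function that replays a typical status progression.
_responses = iter([
    {"status": "PENDING"},
    {"status": "IN_PROGRESS"},
    {"status": "COMPLETED"},
])
final = wait_for_execution(lambda: next(_responses), interval=0.01)
print(final["status"])
```

Passing the fetch function in keeps the loop independent of any particular HTTP library and easy to test.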
Error Handling
Per-File Isolation
In multi-file workflows, errors are isolated:
- File A fails → Files B and C continue processing
- Check individual file results in step_outputs
- Execution status reflects overall state
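Checking individual file results might look like the sketch below. The response shape is an assumption made for illustration: step_outputs is taken to be keyed by step name and then by file ID, with an "error" entry on failed items. The real payload may be structured differently, and failed_files is a hypothetical helper.

```python
def failed_files(step_outputs: dict) -> dict[str, dict[str, str]]:
    """Collect error messages per file, keyed by the step that failed."""
    failures: dict[str, dict[str, str]] = {}
    for step_name, per_file in step_outputs.items():
        for file_id, output in per_file.items():
            if "error" in output:  # assumed marker of a failed item
                failures.setdefault(file_id, {})[step_name] = output["error"]
    return failures


# Hypothetical payload: file_a fails at parse, the other files continue.
step_outputs = {
    "parse": {
        "file_a": {"error": "unreadable PDF"},
        "file_b": {"result": "# Invoice"},
        "file_c": {"result": "# Report"},
    },
    "extract": {
        "file_b": {"result": {"total": 120}},
        "file_c": {"result": {"total": 75}},
    },
}
print(failed_files(step_outputs))
```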
Error Responses
Failed steps include error details.
Billing
Workflows are currently in Beta. Until public release, there is no added cost for Workflows. You still pay for the underlying Marker API requests in line with our API billing.
Try it out
- Create Your First Workflow: Follow the Simple Workflow Tutorial
- Learn Conditional Logic: Explore Conditional Routing