Pipeline Versioning

Before you begin, make sure you have:

A Datalab account with an API key (new accounts include $5 in free credits)
Python 3.10+ installed
The Datalab SDK: pip install datalab-python-sdk
Your DATALAB_API_KEY environment variable set

Version Lifecycle

Every pipeline goes through a predictable lifecycle:

State	`active_version`	Description
Draft	`0`	Edits auto-save. No published version yet.
Saved	`0`	Named pipeline, still no published version.
Published	`1`, `2`, …	Immutable version snapshots exist.

When you edit a published pipeline, your changes go into a draft. The published version is untouched until you explicitly publish again.
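
The lifecycle table can also be read as a small decision rule. The following is a minimal sketch rather than an SDK call: `lifecycle_state` and its `is_named` argument are hypothetical names introduced for illustration, and it assumes `active_version` stays at `0` until the first publish, as the table describes.

```python
def lifecycle_state(active_version: int, is_named: bool) -> str:
    """Classify a pipeline using the lifecycle table (hypothetical helper, not an SDK call)."""
    if active_version < 0:
        raise ValueError("active_version cannot be negative")
    if active_version >= 1:
        # At least one immutable snapshot has been published.
        return "Published"
    # active_version == 0: nothing published yet; naming separates Saved from Draft.
    return "Saved" if is_named else "Draft"


print(lifecycle_state(0, is_named=False))  # Draft
print(lifecycle_state(0, is_named=True))   # Saved
print(lifecycle_state(2, is_named=True))   # Published
```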

Publish a Version

Create an immutable snapshot of the current pipeline steps:

from datalab_sdk import DatalabClient

client = DatalabClient()

# Publish version 1
version = client.create_pipeline_version(
    "pl_abc123",
    description="Initial production release"
)
print(f"Published v{version.version}")  # v1

Each call creates the next version number (1, 2, 3, …). Published versions are immutable: their steps cannot be changed.

Edit and Iterate

After publishing, any edits create a draft that is separate from the published version:

from datalab_sdk import PipelineProcessor

# Edit steps — this creates a draft
client.update_pipeline("pl_abc123", steps=[
    PipelineProcessor(type="convert", settings={"mode": "accurate"}),  # Changed
    PipelineProcessor(type="extract", settings={
        "page_schema": {"type": "object", "properties": {
            "title": {"type": "string"},
            "author": {"type": "string"}  # Added field
        }}
    })
])

# Test the draft
execution = client.run_pipeline("pl_abc123", file_path="test.pdf", version=0)

# Happy with changes? Publish a new version
version = client.create_pipeline_version("pl_abc123", description="Added author field")
print(f"Published v{version.version}")  # v2

version=0 explicitly runs the draft. Omitting version runs the active published version. See Run a Pipeline for version parameter details.

Discard a Draft

Discard unpublished draft edits and restore the steps of a published version:

# Discard draft, revert to active version
pipeline = client.discard_pipeline_draft("pl_abc123")

# Or revert to a specific version
pipeline = client.discard_pipeline_draft("pl_abc123", version=1)
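
To make the draft, publish, and discard semantics concrete, here is a toy in-memory model. It is an illustration only and does not reflect how the service is implemented: `ToyPipeline` and its methods are hypothetical, steps are plain dictionaries, and it assumes that publishing makes the new version the active one.

```python
import copy
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ToyPipeline:
    """In-memory stand-in for the draft/publish semantics (illustrative only, not the SDK)."""
    draft: list = field(default_factory=list)
    versions: dict = field(default_factory=dict)  # version number -> copy of the steps
    active_version: int = 0

    def update(self, steps: list) -> None:
        # Edits only ever touch the draft; published snapshots stay as they are.
        self.draft = copy.deepcopy(steps)

    def publish(self) -> int:
        # Snapshot the draft under the next version number (assumed to become active).
        self.active_version = len(self.versions) + 1
        self.versions[self.active_version] = copy.deepcopy(self.draft)
        return self.active_version

    def discard_draft(self, version: Optional[int] = None) -> None:
        # Restore the draft from the active version, or from a specific one.
        target = self.active_version if version is None else version
        if target not in self.versions:
            raise ValueError(f"no published version {target}")
        self.draft = copy.deepcopy(self.versions[target])


p = ToyPipeline()
p.update([{"type": "convert"}])
p.publish()                                            # version 1
p.update([{"type": "convert"}, {"type": "extract"}])
p.publish()                                            # version 2
p.update([{"type": "extract"}])                        # unpublished edit
p.discard_draft()                                      # back to version 2's steps
print([s["type"] for s in p.draft])                    # ['convert', 'extract']
p.discard_draft(version=1)                             # back to version 1's steps
print([s["type"] for s in p.draft])                    # ['convert']
```

Copying the steps on every transition is what keeps a published snapshot unaffected by later draft edits in this model.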

Browse Version History

List all published versions for a pipeline:

result = client.list_pipeline_versions("pl_abc123")

for v in result["versions"]:
    print(f"v{v.version}: {v.description} (created {v.created})")
    print(f"  Steps: {[s['type'] for s in v.steps]}")

Versions are returned newest-first.
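
Because the list is ordered newest-first, the latest version is simply the first element. The sketch below shows that idea, plus a lookup by version number. `VersionInfo`, `latest_version`, and `find_version` are hypothetical stand-ins written for this example; the objects returned by the SDK may carry more fields or a different type.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VersionInfo:
    """Stand-in with two of the fields read in the loop above; the real SDK type may differ."""
    version: int
    description: str


def latest_version(versions):
    """Return the newest entry, relying on newest-first ordering; None if nothing is published."""
    return versions[0] if versions else None


def find_version(versions, number):
    """Return the entry with the given version number, or None if it is not in the list."""
    return next((v for v in versions if v.version == number), None)


history = [
    VersionInfo(2, "Added author field"),
    VersionInfo(1, "Initial production release"),
]
print(latest_version(history).version)       # 2
print(find_version(history, 1).description)  # Initial production release
print(find_version(history, 5))              # None
```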

Best Practices

Pin production integrations to a specific version. When calling run_pipeline() from production code, pass an explicit version number so that a later publish cannot silently change what production runs:

# Production code — pinned to v2
execution = client.run_pipeline(
    "pl_abc123",
    file_path="document.pdf",
    version=2  # Always runs v2, even if v3 is published later
)
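
One way to keep the pinned number out of the code itself is to read it from configuration. This is a sketch under assumptions: the `PIPELINE_VERSION` variable name and the `resolve_pinned_version` helper are hypothetical and not part of the SDK. The helper refuses a missing value and `0`, since `0` selects the draft rather than a published snapshot.

```python
import os


def resolve_pinned_version(env=os.environ, name="PIPELINE_VERSION"):
    """Read a pinned version from the environment (hypothetical variable name)."""
    raw = env.get(name)
    if raw is None:
        raise RuntimeError(f"{name} is not set; production runs should pin a version")
    version = int(raw)  # raises ValueError on non-numeric input
    if version < 1:
        # 0 means "draft", which production code should never run by accident.
        raise ValueError(f"{name} must be a published version (>= 1), got {version}")
    return version


print(resolve_pinned_version({"PIPELINE_VERSION": "2"}))  # 2
```

The returned integer would then be passed as the `version` argument of `run_pipeline()`.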

Test drafts before publishing. Use version=0 to run the draft version against test documents:

# Test draft changes
execution = client.run_pipeline(
    "pl_abc123",
    file_path="test_document.pdf",
    version=0  # Runs draft
)

Use descriptions. Include a meaningful description when publishing so your team can understand what changed:

client.create_pipeline_version(
    "pl_abc123",
    description="Switch to accurate mode, add line_items extraction"
)

Archive unused pipelines. Keep your pipeline list clean:

client.archive_pipeline("pl_old123")

# Include archived pipelines in the listing when you need them
result = client.list_pipelines(include_archived=True)

Next Steps

Run a Pipeline

Execute pipelines with version selection, overrides, and polling.

Create a Pipeline

Build pipelines with Forge or the SDK.

Pipeline Overview

Processor types, composition rules, and when to use pipelines.

SDK Reference

Full SDK reference for all pipeline methods.
