# Create Document

> Generate DOCX files from markdown with track changes support.

Convert markdown to Word documents (DOCX) with support for tracked insertions, tracked deletions, and comments. This is useful for generating legal documents, contracts with redlines, and collaborative review documents.

## Quick Start

<CodeGroup>
  ```python Python SDK theme={null}
  from datalab_sdk import DatalabClient

  client = DatalabClient()

  markdown = (
      "# Contract\n\n"
      "This agreement is between "
      '<ins data-revision-author="Editor" data-revision-datetime="2024-01-15T10:00:00Z">'
      "Acme Corp</ins> and the client."
  )

  result = client.create_document(markdown=markdown)
  result.save_output("contract")  # saves contract.docx
  print(f"Document created: {result.page_count} page(s)")
  ```

  ```bash cURL theme={null}
  # Submit request
  curl -X POST https://www.datalab.to/api/v1/create-document \
    -H "X-API-Key: $DATALAB_API_KEY" \
    -H "Content-Type: application/json" \
    -d '{
      "markdown": "# Contract\n\nThis agreement is between <ins data-revision-author=\"Editor\" data-revision-datetime=\"2024-01-15T10:00:00Z\">Acme Corp</ins> and the client.",
      "output_format": "docx"
    }'

  # Poll for results (use request_check_url from response)
  curl https://www.datalab.to/api/v1/create-document/REQUEST_ID \
    -H "X-API-Key: $DATALAB_API_KEY"
  ```

  ```python Python (requests) theme={null}
  import base64
  import os
  import time

  import requests

  headers = {"X-API-Key": os.getenv("DATALAB_API_KEY")}

  # Submit request
  response = requests.post(
      "https://www.datalab.to/api/v1/create-document",
      json={
          "markdown": "# Contract\n\nThis agreement is between "
                      '<ins data-revision-author="Editor" '
                      'data-revision-datetime="2024-01-15T10:00:00Z">'
                      "Acme Corp</ins> and the client.",
          "output_format": "docx"
      },
      headers=headers
  )

  response.raise_for_status()  # Fail fast on HTTP error responses
  check_url = response.json()["request_check_url"]

  # Poll for results
  while True:
      result = requests.get(check_url, headers=headers).json()
      if result["status"] == "complete":
          docx_bytes = base64.b64decode(result["output_base64"])
          with open("contract.docx", "wb") as f:
              f.write(docx_bytes)
          print("Document saved as contract.docx")
          break
      elif result.get("error"):
          print(f"Error: {result['error']}")
          break
      time.sleep(2)
  ```
</CodeGroup>

## SDK Usage

Use `client.create_document()` to create a DOCX from markdown:

```python theme={null}
from datalab_sdk import DatalabClient

client = DatalabClient()

result = client.create_document(
    markdown="# Title\n\nDocument content here.",
    output_format="docx",       # Only 'docx' is supported
    webhook_url=None,           # Optional completion webhook
    save_output="output/doc",   # Optional: saves output/doc.docx automatically
)

print(result.success)         # True if creation succeeded
print(result.page_count)      # Number of pages
print(result.cost_breakdown)  # Cost details
result.save_output("output/contract")  # Saves output/contract.docx
```

### SDK Method Parameters

| Parameter       | Type     | Default      | Description                                         |
| --------------- | -------- | ------------ | --------------------------------------------------- |
| `markdown`      | str      | **Required** | Markdown content with optional track changes markup |
| `output_format` | str      | `"docx"`     | Output format (only `"docx"` is supported)          |
| `webhook_url`   | str      | None         | Optional webhook URL for completion notification    |
| `save_output`   | str/Path | None         | File path to save the output DOCX                   |
| `max_polls`     | int      | `300`        | Maximum polling attempts                            |
| `poll_interval` | int      | `1`          | Seconds between polls                               |

### SDK Result Fields

| Field            | Type  | Description                         |
| ---------------- | ----- | ----------------------------------- |
| `success`        | bool  | Whether document creation succeeded |
| `status`         | str   | `"complete"` when done              |
| `output_format`  | str   | `"docx"`                            |
| `output_base64`  | str   | Base64-encoded DOCX file            |
| `runtime`        | float | Processing time in seconds          |
| `page_count`     | int   | Pages in the generated document     |
| `cost_breakdown` | dict  | Cost details                        |
| `error`          | str   | Error message if creation failed    |

## How It Works

1. Send markdown content with optional track changes markup
2. The API converts it to a DOCX file with proper Word formatting
3. Track changes tags become native Word revision marks
4. The DOCX file is returned as a base64-encoded string
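
Step 4 can be sketched as follows. This is a minimal helper, assuming a polled result shaped like the response fields documented on this page. `decode_docx` is a hypothetical name and not part of the SDK; the signature check relies only on the fact that DOCX files are ZIP archives.

```python
import base64


def decode_docx(result: dict) -> bytes:
    """Decode the base64 DOCX payload from a completed poll result.

    Hypothetical helper (not part of the Datalab SDK). It only reads the
    `status`, `error`, and `output_base64` fields.
    """
    if result.get("error"):
        raise RuntimeError(f"Document creation failed: {result['error']}")
    if result.get("status") != "complete":
        raise ValueError("Result is not complete yet; keep polling")

    docx_bytes = base64.b64decode(result["output_base64"])
    # DOCX files are ZIP archives, so the payload should start with "PK".
    if not docx_bytes.startswith(b"PK"):
        raise ValueError("Decoded payload does not look like a DOCX (ZIP) file")
    return docx_bytes
```

The returned bytes can be written to disk with `open(path, "wb")` or passed on to another service without touching the filesystem.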

## Track Changes Markup

### Insertions

Mark inserted text with `<ins>` tags:

```html theme={null}
<ins data-revision-author="John Doe" data-revision-datetime="2024-01-15T10:00:00Z">newly added text</ins>
```

| Attribute                | Required | Description                                       |
| ------------------------ | -------- | ------------------------------------------------- |
| `data-revision-author`   | Yes      | Author name for the insertion                     |
| `data-revision-datetime` | Yes      | ISO 8601 timestamp (e.g., `2024-01-15T10:00:00Z`) |

### Deletions

Mark deleted text with `<del>` tags:

```html theme={null}
<del data-revision-author="Jane Smith" data-revision-datetime="2024-01-15T11:00:00Z">removed text</del>
```

| Attribute                | Required | Description                  |
| ------------------------ | -------- | ---------------------------- |
| `data-revision-author`   | Yes      | Author name for the deletion |
| `data-revision-datetime` | Yes      | ISO 8601 timestamp           |

### Comments

Add comments with `<comment>` tags:

```html theme={null}
<comment data-comment-author="Reviewer" data-comment-datetime="2024-01-15T12:00:00Z" text="Please verify this clause">annotated text</comment>
```

| Attribute               | Required | Description                                           |
| ----------------------- | -------- | ----------------------------------------------------- |
| `data-comment-author`   | Yes      | Author/reviewer name                                  |
| `text`                  | Yes      | The comment text                                      |
| `data-comment-datetime` | No       | ISO 8601 timestamp (defaults to current time)         |
| `data-comment-initial`  | No       | Author initials (auto-generated from name if omitted) |

## Parameters

| Parameter       | Type   | Required | Description                                         |
| --------------- | ------ | -------- | --------------------------------------------------- |
| `markdown`      | string | Yes      | Markdown content with optional track changes markup |
| `output_format` | string | No       | Output format (currently only `docx` is supported)  |
| `webhook_url`   | string | No       | Webhook URL to notify when processing completes     |

## Response

The endpoint is asynchronous: submit a request, then poll the returned `request_check_url` until processing finishes.

**Initial response:**

```json theme={null}
{
  "success": true,
  "request_id": "abc123",
  "request_check_url": "https://www.datalab.to/api/v1/create-document/abc123"
}
```

**Final response (when polling):**

| Field            | Type   | Description                         |
| ---------------- | ------ | ----------------------------------- |
| `status`         | string | `processing` or `complete`          |
| `success`        | bool   | Whether document creation succeeded |
| `output_format`  | string | `docx`                              |
| `output_base64`  | string | Base64-encoded DOCX file            |
| `runtime`        | float  | Processing time in seconds          |
| `page_count`     | int    | Pages in the generated document     |
| `cost_breakdown` | object | Cost details                        |
| `error`          | string | Error message if creation failed    |

## Full Example

A contract with insertions, deletions, and reviewer comments:

```python theme={null}
import base64
import os
import time

import requests

headers = {"X-API-Key": os.getenv("DATALAB_API_KEY")}

markdown = """# Service Agreement

## Parties

This agreement is between <ins data-revision-author="Legal" data-revision-datetime="2024-06-01T09:00:00Z">Acme Corporation ("Provider")</ins> and <del data-revision-author="Legal" data-revision-datetime="2024-06-01T09:00:00Z">the Client</del><ins data-revision-author="Legal" data-revision-datetime="2024-06-01T09:00:00Z">GlobalTech Inc. ("Client")</ins>.

## Terms

The service period begins on <comment data-comment-author="Reviewer" text="Confirm start date with finance">January 1, 2025</comment> and continues for <del data-revision-author="Legal" data-revision-datetime="2024-06-01T10:00:00Z">12</del><ins data-revision-author="Legal" data-revision-datetime="2024-06-01T10:00:00Z">24</ins> months.

## Payment

The total contract value is <ins data-revision-author="Finance" data-revision-datetime="2024-06-02T14:00:00Z">$150,000</ins> payable in quarterly installments.
"""

response = requests.post(
    "https://www.datalab.to/api/v1/create-document",
    json={"markdown": markdown, "output_format": "docx"},
    headers=headers
)

response.raise_for_status()  # Fail fast on HTTP error responses
check_url = response.json()["request_check_url"]

# Poll until the document is ready, stopping if the API reports an error
while True:
    result = requests.get(check_url, headers=headers).json()
    if result["status"] == "complete":
        docx_bytes = base64.b64decode(result["output_base64"])
        with open("service_agreement.docx", "wb") as f:
            f.write(docx_bytes)
        print(f"Document saved ({result['page_count']} pages)")
        break
    elif result.get("error"):
        print(f"Error: {result['error']}")
        break
    time.sleep(2)
```
```

The generated DOCX file opens in Microsoft Word with native track changes visible, allowing reviewers to accept or reject each change.
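
To check the output without opening Word, you can inspect the file directly. The sketch below counts revision elements in the main document part. It relies on the WordprocessingML convention that tracked insertions are `w:ins` elements and tracked deletions are `w:del` elements; `count_revisions` is a hypothetical helper, and the exact element counts this API produces for a given input are not documented here.

```python
import io
import xml.etree.ElementTree as ET
import zipfile

# WordprocessingML main namespace, in ElementTree's {uri}tag notation.
W = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"


def count_revisions(docx_bytes: bytes) -> dict:
    """Count tracked insertions and deletions in a DOCX file's main body.

    Hypothetical helper; it reads only word/document.xml, so revisions in
    headers, footers, or footnotes are not counted.
    """
    with zipfile.ZipFile(io.BytesIO(docx_bytes)) as archive:
        root = ET.fromstring(archive.read("word/document.xml"))
    return {
        "insertions": sum(1 for _ in root.iter(W + "ins")),
        "deletions": sum(1 for _ in root.iter(W + "del")),
    }
```

For the script above, `count_revisions(docx_bytes)` should report non-zero counts if the tracked changes were converted.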

## Use Cases

* **Legal document generation** — create contracts with tracked revisions
* **Contract redlining** — mark up agreements with insertions and deletions
* **Collaborative review** — add reviewer comments to documents
* **Document automation** — generate Word documents from templates with dynamic content

## Next Steps

<CardGroup cols={2}>
  <Card title="Track Changes Extraction" icon="file-diff" href="/docs/recipes/extract-redlines-and-comments/track-changes-from-word-documents">
    Extract track changes from existing Word documents
  </Card>

  <Card title="Document Conversion" icon="file-text" href="/docs/recipes/conversion/conversion-api-overview">
    Convert documents to markdown, HTML, or JSON
  </Card>

  <Card title="Webhooks" icon="bell" href="/platform/webhooks">
    Get notified when document creation completes
  </Card>

  <Card title="Pipelines" icon="workflow" href="/docs/recipes/pipelines/pipeline-overview">
    Chain processors into versioned, reusable pipelines.
  </Card>
</CardGroup>
