# File Upload

> Upload and manage files for use in pipelines and document processing.

Upload files to Datalab storage and reference them across API calls and pipelines.

## SDK Usage

The SDK handles the upload flow automatically:

```python theme={null}
from datalab_sdk import DatalabClient

client = DatalabClient()

# Upload a single file
file = client.upload_files("document.pdf")
print(f"Uploaded: {file.reference}")  # datalab://file-abc123

# Upload multiple files
files = client.upload_files(["doc1.pdf", "doc2.pdf", "doc3.pdf"])
for f in files:
    print(f"{f.original_filename}: {f.reference}")
```

### Use in Pipelines

```python theme={null}
# Upload files
files = client.upload_files(["invoice1.pdf", "invoice2.pdf"])

# Use in pipeline
for f in files:
    execution = client.run_pipeline("pl_abc123", file_url=f.reference)
```

### File Management

```python theme={null}
# List files
result = client.list_files(limit=50)
for file in result['files']:
    print(f"{file.original_filename}: {file.file_size} bytes")

# Get metadata
file = client.get_file_metadata(123)

# Get download URL
download = client.get_file_download_url(file_id=123, expires_in=3600)
print(download['download_url'])

# Delete file
client.delete_file(123)
```

See [SDK File Management](/docs/welcome/sdk/file-management) for complete documentation.

## REST API

Uploading through the REST API takes three steps: request an upload URL, upload the file to that URL, and confirm the upload.

### 1. Request Upload URL

```bash theme={null}
curl -X POST https://www.datalab.to/api/v1/files/upload \
  -H "X-API-Key: YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"filename": "document.pdf", "content_type": "application/pdf"}'
```

To store the file in EU infrastructure, add `"processing_location": "eu"` to the request body:

```bash theme={null}
curl -X POST https://www.datalab.to/api/v1/files/upload \
  -H "X-API-Key: YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"filename": "document.pdf", "content_type": "application/pdf", "processing_location": "eu"}'
```

Response:

```json theme={null}
{
  "file_id": 123,
  "upload_url": "https://presigned-url...",
  "expires_in": 3600,
  "reference": "datalab://file-abc123"
}
```
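As a minimal sketch, the request body for step 1 could be assembled in Python as follows. The helper name `build_upload_request` is hypothetical, and guessing the content type with `mimetypes` is an assumption made for convenience; only the `filename`, `content_type`, and `processing_location` fields come from the requests shown above.

```python
import mimetypes


def build_upload_request(filename, content_type=None, processing_location=None):
    """Build the JSON body for POST /api/v1/files/upload (hypothetical helper)."""
    if content_type is None:
        # Assumption: fall back to a guess based on the file extension.
        guessed, _ = mimetypes.guess_type(filename)
        content_type = guessed or "application/octet-stream"
    body = {"filename": filename, "content_type": content_type}
    if processing_location is not None:
        # "eu" is the only value documented on this page.
        body["processing_location"] = processing_location
    return body


print(build_upload_request("document.pdf", processing_location="eu"))
```

The returned dictionary can be passed as the `json=` argument of the HTTP client of your choice.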

### 2. Upload File

```bash theme={null}
curl -X PUT "{upload_url}" \
  -H "Content-Type: application/pdf" \
  --data-binary @document.pdf
```

### 3. Confirm Upload

```bash theme={null}
curl https://www.datalab.to/api/v1/files/123/confirm \
  -H "X-API-Key: YOUR_API_KEY"
```

### Complete Python Example

```python theme={null}
import requests

API_KEY = "YOUR_API_KEY"
headers = {"X-API-Key": API_KEY}

# Step 1: Request upload URL
response = requests.post(
    "https://www.datalab.to/api/v1/files/upload",
    json={"filename": "document.pdf", "content_type": "application/pdf"},
    headers=headers
)
response.raise_for_status()  # stop early on an HTTP error
data = response.json()
file_id = data["file_id"]
upload_url = data["upload_url"]
reference = data["reference"]

# Step 2: Upload file (same Content-Type as declared in step 1)
with open("document.pdf", "rb") as f:
    upload = requests.put(upload_url, data=f, headers={"Content-Type": "application/pdf"})
upload.raise_for_status()

# Step 3: Confirm upload
confirm = requests.get(f"https://www.datalab.to/api/v1/files/{file_id}/confirm", headers=headers)
confirm.raise_for_status()

print(f"File ready: {reference}")
```

## File Management API

### List Files

```http theme={null}
GET /api/v1/files?limit=50&offset=0
```
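The `limit` and `offset` parameters suggest offset-based pagination. A minimal sketch of walking every page might look like the following; `iter_files` and `fetch_page` are hypothetical names, and the assumption that results sit under a `"files"` key is borrowed from the SDK example rather than confirmed for the REST response.

```python
def iter_files(fetch_page, limit=50):
    """Yield every file by walking limit/offset pages.

    fetch_page(limit, offset) is a caller-supplied function that performs
    GET /api/v1/files and returns the decoded JSON response.
    """
    offset = 0
    while True:
        page = fetch_page(limit, offset)
        files = page.get("files", [])  # assumption: results live under "files"
        yield from files
        if len(files) < limit:
            break  # a short (or empty) page means nothing is left
        offset += limit


# Stand-in fetcher for illustration; replace it with a real HTTP call.
def fake_fetch(limit, offset):
    data = [{"id": i} for i in range(120)]
    return {"files": data[offset:offset + limit]}


print(sum(1 for _ in iter_files(fake_fetch)))  # prints 120
```

Injecting the fetch function keeps the paging logic independent of any particular HTTP client.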

### Get File Metadata

```http theme={null}
GET /api/v1/files/{file_id}
```

### Get Download URL

```http theme={null}
GET /api/v1/files/{file_id}/download?expires_in=3600
```

### Delete File

```http theme={null}
DELETE /api/v1/files/{file_id}
```
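The four endpoints above share one base path, so their URLs can be built with a small helper. This is only a sketch: `file_endpoint` is a hypothetical name, and the base URL is taken from the upload examples on this page.

```python
from urllib.parse import urlencode

BASE_URL = "https://www.datalab.to/api/v1/files"


def file_endpoint(file_id=None, action=None, **query):
    """Return the URL for a file-management endpoint (hypothetical helper)."""
    parts = [BASE_URL]
    if file_id is not None:
        parts.append(str(file_id))
    if action is not None:
        parts.append(action)  # e.g. "download"
    url = "/".join(parts)
    if query:
        url += "?" + urlencode(query)
    return url


print(file_endpoint(limit=50, offset=0))                 # list files
print(file_endpoint(123))                                # metadata (GET) or delete (DELETE)
print(file_endpoint(123, "download", expires_in=3600))   # download URL
```

Send each URL with the `X-API-Key` header and the HTTP method listed for that endpoint.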

## Using File References

Once a file is uploaded, pass its `datalab://file-{id}` reference as the `file_url` in any API call:

```python theme={null}
import json

import requests

headers = {"X-API-Key": "YOUR_API_KEY"}

# In Convert API
response = requests.post(
    "https://www.datalab.to/api/v1/convert",
    data={
        "file_url": "datalab://file-abc123",
        "output_format": "markdown",
        "mode": "balanced"
    },
    headers=headers
)

# In Form Filling API
field_data = {}  # placeholder: fill in the values for your form's fields
response = requests.post(
    "https://www.datalab.to/api/v1/fill",
    data={
        "file_url": "datalab://file-abc123",
        "field_data": json.dumps(field_data)
    },
    headers=headers
)
```
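Before sending a reference, it can help to check that a string actually has the `datalab://file-{id}` shape instead of being, say, an ordinary HTTPS URL. The sketch below is illustrative only: `parse_file_reference` is a hypothetical helper, and the set of characters allowed in the ID is an assumption, since this page only shows the example `abc123`.

```python
import re

# Assumption: the ID consists of letters, digits, underscores, or hyphens.
_REFERENCE_RE = re.compile(r"datalab://file-([A-Za-z0-9_-]+)")


def parse_file_reference(reference):
    """Return the ID portion of a datalab:// reference, or raise ValueError."""
    match = _REFERENCE_RE.fullmatch(reference)
    if match is None:
        raise ValueError(f"not a Datalab file reference: {reference!r}")
    return match.group(1)


print(parse_file_reference("datalab://file-abc123"))  # prints abc123
```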

## Limits

| Limit               | Value                |
| ------------------- | -------------------- |
| Maximum file size   | 200 MB               |
| Upload URL expiry   | 1 hour               |
| Download URL expiry | 1 minute to 24 hours |

See [API Limits](/docs/common/limits) for complete details.
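These limits can be checked on the client before any request is made. The following is a rough sketch under stated assumptions: the helper names are hypothetical, "200 MB" is interpreted as 200 × 1024 × 1024 bytes (the exact definition is not given here), and `expires_in` is treated as seconds, consistent with the `3600` value paired with the one-hour expiry above.

```python
import os

MAX_FILE_SIZE_BYTES = 200 * 1024 * 1024  # assumption: "200 MB" means 200 MiB
MIN_EXPIRES_IN = 60                      # 1 minute, in seconds
MAX_EXPIRES_IN = 24 * 60 * 60            # 24 hours, in seconds


def check_file_size(path):
    """Return the file size in bytes, or raise ValueError if it is too large."""
    size = os.path.getsize(path)
    if size > MAX_FILE_SIZE_BYTES:
        raise ValueError(f"{path} is {size} bytes; the limit is {MAX_FILE_SIZE_BYTES}")
    return size


def clamp_expires_in(seconds):
    """Clamp a requested download URL lifetime into the allowed range."""
    return max(MIN_EXPIRES_IN, min(MAX_EXPIRES_IN, seconds))


print(clamp_expires_in(10))       # prints 60
print(clamp_expires_in(100_000))  # prints 86400
```

Failing fast on an oversized file avoids requesting an upload URL that cannot be used.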

<Card title="Try Datalab" icon="rocket" href="https://www.datalab.to/auth/sign_up">
  Get started with our API in less than a minute. We include free credits.
</Card>
