Our cloud-hosted API documentation can be found here. With caveats and exceptions detailed below, the container image shares the same API.

Supported endpoints

The container currently supports:
  • /api/v1/marker documented here.
  • /api/v1/ocr documented here.
It does not yet support /api/v1/table_rec or /api/v1/layout, but will in an upcoming release.
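For example, here's a minimal sketch of posting a PDF to the container's /api/v1/ocr endpoint. It assumes the container is listening on localhost:8000; adjust the host and port to match your deployment:

import requests

# Post a PDF to the container's OCR endpoint.
url = "http://localhost:8000/api/v1/ocr"  # adjust host/port to your deployment

form_data = {
    'file': ('test.pdf', open('test.pdf', 'rb'), 'application/pdf'),
}

response = requests.post(url, files=form_data)
data = response.json()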

Authentication

API authentication is not supported in the container. We assume customers will be running our image on their own infrastructure in private networks. You may send the X-API-Key header detailed here, but it will be ignored and any value works.
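As a quick illustration, the two requests below behave identically against the container, with or without the header (again assuming a container on localhost:8000):

import requests

url = "http://localhost:8000/api/v1/marker"  # adjust host/port to your deployment
files = {'file': ('test.pdf', open('test.pdf', 'rb'), 'application/pdf')}

# The container ignores X-API-Key, so these two calls are equivalent.
requests.post(url, files=files)
requests.post(url, files=files, headers={'X-API-Key': 'any-value-works'})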

use_llm is supported with your own keys

Many Datalab customers use Marker’s LLM post-processors to improve their parse output, supported by the use_llm flag in /api/v1/marker. use_llm is supported in the container image, but you must bring your own keys. The container will support any service supported by Marker, which currently includes:
  • OpenAI (via OpenAI’s platform & Azure)
  • Anthropic (via Anthropic’s platform only)
  • Gemini (via Google AI Studio & Vertex)
  • Ollama (if you want a fully on-prem setup)
If you need to ensure your data is sent only to infrastructure you control, you must use an Ollama instance that you host. The OpenAI service supports any OpenAI-compatible interface, including OpenRouter. Datalab’s API uses Gemini 2.0 Flash behind the scenes to power use_llm, so Google’s models are the best-supported. We can’t speak to performance with other models, but you’re welcome to try (and we’d love to hear from you when you do, as would folks in our Discord channel). Note: use_llm will only work with a vision-capable model.

How to call marker with use_llm in the container

When you call /api/v1/marker in the container, you must send your keys and configuration for use_llm in the API request itself. Here’s what that looks like if you want to use Gemini 2.0 Flash (the model we use in production) via OpenRouter:
import json
import requests

# The container's local address; adjust host and port to match your deployment.
url = "http://localhost:8000/api/v1/marker"

form_data = {
    'file': ('test.pdf', open('test.pdf', 'rb'), 'application/pdf'),
    'output_format': (None, 'markdown'),
    'use_llm': (None, True),  # Note `use_llm` is True
    'additional_config': (None, json.dumps({
        "llm_service": "marker.services.openai.OpenAIService",
        "openai_model": "google/gemini-2.0-flash-001",
        "openai_api_key": "<OPENROUTER_API_KEY>",
        "openai_base_url": "https://openrouter.ai/api/v1",
    }))
}

response = requests.post(url, files=form_data)
data = response.json()
The additional_config parameter is a dictionary of configuration options passed to Marker itself. On our hosted API, only a small subset of configuration keys is allowed via additional_config, but the container accepts all of Marker’s configuration options. Below are the keys required for each provider.

Ollama

The Ollama service routes calls to an Ollama instance you host.
{
    "llm_service": "marker.services.ollama.OllamaService",
    "ollama_base_url": "...",
    "ollama_model": "...",
}
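As a sketch, a fully on-prem request pairs the container with an Ollama instance you host. The base URL below uses Ollama's default port, and the model name is only an example; per the note above, it must be a vision-capable model:

import json
import requests

url = "http://localhost:8000/api/v1/marker"  # adjust host/port to your deployment

form_data = {
    'file': ('test.pdf', open('test.pdf', 'rb'), 'application/pdf'),
    'output_format': (None, 'markdown'),
    'use_llm': (None, True),
    'additional_config': (None, json.dumps({
        "llm_service": "marker.services.ollama.OllamaService",
        "ollama_base_url": "http://localhost:11434",  # Ollama's default port
        "ollama_model": "llama3.2-vision",  # example; must be vision-capable
    }))
}

response = requests.post(url, files=form_data)
data = response.json()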

Gemini (via Google Vertex)

The Google Vertex service supports Gemini model calls on Google Vertex.
{
    "llm_service": "marker.services.vertex.GoogleVertexService",
    "gemini_api_key": "...",
    "gemini_model_name": "...",
    "vertex_project_id": "...",
    "vertex_location": "...",
    "vertex_dedicated": "...",
}

Gemini (via Google AI Studio)

The Google Gemini service supports Gemini calls using a Google AI Studio key.
{
    "llm_service": "marker.services.gemini.GoogleGeminiService",
    "gemini_api_key": "...",
    "gemini_model_name": "...",
}

OpenAI

The OpenAI service supports any OpenAI-compatible interface by setting openai_base_url, which defaults to OpenAI’s servers.
{
    "llm_service": "marker.services.openai.OpenAIService",
    "openai_model": "...",
    "openai_api_key": "...",
    "openai_base_url": "...",
}

OpenAI (via Azure)

The Azure OpenAI service supports OpenAI model calls on Azure.
{
    "llm_service": "marker.services.azure_openai.AzureOpenAIService",
    "azure_endpoint": "...",
    "azure_api_key": "...",
    "azure_api_version": "...",
    "deployment_name": "...",
}

Claude (via Anthropic)

The Claude service supports Claude models served by Anthropic.
{
    "llm_service": "marker.services.claude.ClaudeService",
    "claude_api_key": "...",
    "claude_model_name": "...",
}
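Whichever provider you choose, the dictionary is sent the same way: serialize it with json.dumps and attach it as the additional_config form field, as in the full example above. For instance, with a placeholder Anthropic key and an example model name:

import json

claude_config = {
    "llm_service": "marker.services.claude.ClaudeService",
    "claude_api_key": "<ANTHROPIC_API_KEY>",
    "claude_model_name": "claude-3-5-sonnet-20241022",  # example model name
}

# Reuses the form_data dict from the /api/v1/marker example above.
form_data["additional_config"] = (None, json.dumps(claude_config))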

Setting up a default LLM service and keys in the container

The container doesn’t currently support setting a default LLM service with keys, but will in an upcoming release.

PDFs and images are supported; document conversion is not yet supported

Datalab’s API supports many file types. The container currently supports PDFs and image file types; support for other file types will arrive in an upcoming release.