Changelog

8/5/2025

Launch a new OCR model with improved math performance.
Improve marker quality in cases where there are inline equations or other text that needs OCR.

7/25/2025

Improve speed of LLM mode and of requests that return multiple output formats.

7/20/2025

Launch a visual editor for structured extraction that lets you edit schemas and visualize results.

7/15/2025

Add a visual editor for marker prompts that lets you see how the document was changed, test across documents, and save prompts.

7/1/2025

Structured extraction beta - pass page_schema to the marker endpoint to extract structured data from documents. The schema should be a pydantic schema generated with .model_json_schema(), or another JSON schema format.
Support the new chunks output format for marker, which is a simplified list of blocks with their full html, ideal for chunking/RAG.
Marker endpoint is now promptable - pass block_correction_prompt to the marker endpoint to correct the output of marker with your custom logic.
We support additional configuration parameters for marker via the additional_config parameter. This is a JSON object that maps configuration option names to their values. You can see the exact options in the API schema.
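As a rough illustration of how the three parameters above might be assembled, here is a minimal sketch that builds the request fields without sending anything. The helper name build_marker_fields, the example schema, the prompt text, and the example_option key are all hypothetical; it also assumes the JSON values are sent as JSON-encoded strings. Check the API schema for the real option names and encoding.

```python
import json

# Hand-written JSON schema for illustration; a schema generated from a
# pydantic model would serve the same role.
invoice_schema = {
    "type": "object",
    "properties": {
        "invoice_number": {"type": "string"},
        "total": {"type": "number"},
    },
    "required": ["invoice_number", "total"],
}


def build_marker_fields(page_schema=None, block_correction_prompt=None,
                        additional_config=None):
    """Assemble request fields for the marker endpoint (hypothetical helper).

    Assumes JSON-valued parameters are passed as JSON-encoded strings.
    """
    fields = {}
    if page_schema is not None:
        fields["page_schema"] = json.dumps(page_schema)
    if block_correction_prompt is not None:
        fields["block_correction_prompt"] = block_correction_prompt
    if additional_config is not None:
        # Keys are configuration option names, values are their settings.
        fields["additional_config"] = json.dumps(additional_config)
    return fields


fields = build_marker_fields(
    page_schema=invoice_schema,
    block_correction_prompt="Rewrite all dates as YYYY-MM-DD.",
    additional_config={"example_option": True},  # placeholder option name
)
```

The resulting dict can be passed as form data to whichever HTTP client you already use.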

6/26/2025

Support multiple output formats for one doc by passing them as comma-separated values in output_format for marker.
Complete redesign of the dashboard, with a new look and feel. This will also make it easier for us to improve functionality in the future.
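For the comma-separated output_format value mentioned above, a small sketch of building the string from a list. The helper name and the specific format names used here are assumptions for illustration; the API schema lists the accepted values.

```python
def join_output_formats(formats):
    """Build a comma-separated output_format value (hypothetical helper).

    Drops duplicates while keeping the first occurrence's position.
    """
    seen = []
    for name in formats:
        name = name.strip()
        if "," in name or not name:
            raise ValueError(f"invalid format name: {name!r}")
        if name not in seen:
            seen.append(name)
    return ",".join(seen)


# Format names below are assumed, not confirmed by this changelog.
output_format = join_output_formats(["markdown", "chunks", "markdown"])
```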

6/18/2025

Improve the playground to make it more functional (easier to test options)
Significantly improve styling in the playground
Add a public version of the playground to make marker easier to test

6/3/2025

Initial launch of playground, for testing marker parsing configurations

5/27/2025

New OCR model that benchmarks better overall, handles inline math, and gives detailed character bboxes.
Add format_lines flag to marker to add inline math and formatting to lines. This also automatically OCRs lines that need it.

3/26/2025

Add support for multiple file formats - spreadsheets, epub, html, in addition to existing document, image, pdf, and presentation formats.
Improve inline math and formatting when passing use_llm.
use_llm (the high accuracy mode) now costs the same as regular inference.

1/30/2025

Marker:

Integrate a new table recognition model, which handles rowspans and colspans better. This is a significant improvement on the old model.
Improve the --use_llm option to merge tables across pages, OCR handwriting, OCR forms, and generally have much higher quality than before.
Integrate a new LaTeX OCR model that is significantly more accurate.
Add links and references to the markdown - the references include internal links.

General:

Speed up inference time.
Remove the line detection endpoint - it had low usage.
Improve the table_rec endpoint - it now takes the --use_llm flag, and should run much faster.

1/3/2025

Add the use_llm option to the marker API - this uses an LLM to make conversion much more accurate for tables, forms, inline math, and complex pages. It’s a beta feature, and will currently double the cost per request.
Add other options to the marker endpoint.
- Use disable_image_extraction to disable image extraction for marker.
- Use strip_existing_ocr to strip all existing OCR text (for example, text added by something like tesseract) and re-OCR the document.
Better automatic heuristics for when to OCR with marker.
Better text extraction and layout detection for marker.
Speed up the marker and OCR endpoints by ~30%.

12/4/2024

Uploaded files can now be up to 200MB in size.
Improved speed by optimizing file handling on the backend.

12/3/2024

We now offer $5 in free credits to new signups
Additional bugfixes to improve markdown output quality

12/2/2024

We sped up file operations internally, which should result in a decent API speed boost
We now handle blockquotes and nested lists with the marker endpoint

11/27/2024

Marker is now at v1, with a lot of improvements - it’s 4x faster than a month ago, and quality is much higher across all document types
The layout model has been upgraded to a new version, with more potential prediction types

10/31/2024

More API speedups, on the order of 15-20% for marker.
Bump concurrency/rate limits to 200.
Improve stability of service under load.
If you cancel, you will now retain your credits until the end of the month.
Visual improvements on the marketing site.

10/28/2024

Significant API speedups, on the order of 40% faster.

10/25/2024

Flatten form fields into pdf when extracting tables and markdown
Fix page separators - they now appear at the start of every page, and include a page number

10/23/2024

Speed up marker, layout, and detection by 20-30%
Fix various bugs that cause edge case errors in conversion
Increase concurrent request limit to 100

10/21/2024

Significantly improve marker output quality
- Include header levels like h1, h2, etc.
- Parse tables very accurately
- Improve block type detection and markdown quality
- Fix many output bugs
Add in new table recognition model at the /table_rec endpoint
- This will detect and convert tables into a given format
Improve OCR, layout, text detection quality
Fix memory leaks and improve performance
Fix bugs with pagination and marker

8/19/2024

Add in new OCR model with better accuracy across the board
Language is now optional for marker and OCR model
Increase max page count and max pixel width

7/20/2024

Drop prices for marker and surya inference.

7/12/2024

Speed up marker and surya text detection/layout by 10-15%.

7/10/2024

Increase concurrent request limit to 50.

7/6/2024

Major infrastructure stability improvements.

7/3/2024

Added response caching for up to 1 hour. If you send the same document to the same endpoint, with the same options, within that time, you’ll get a cache hit and won’t be billed again.
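The caching rule above (same document, same endpoint, same options, within one hour) can be pictured with a small sketch. This is not the service's implementation, which is not described here; the key derivation, class name, and in-memory storage are assumptions chosen only to illustrate the rule.

```python
import hashlib
import json

CACHE_TTL_SECONDS = 3600  # "up to 1 hour", per the entry above


def cache_key(document: bytes, endpoint: str, options: dict) -> str:
    """Hypothetical key: identical document, endpoint, and options collide."""
    material = {
        "document": hashlib.sha256(document).hexdigest(),
        "endpoint": endpoint,
        "options": options,
    }
    # sort_keys makes the key independent of option ordering.
    encoded = json.dumps(material, sort_keys=True).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


class ResponseCache:
    """In-memory cache with a fixed time-to-live (illustrative only)."""

    def __init__(self, ttl=CACHE_TTL_SECONDS):
        self.ttl = ttl
        self._entries = {}  # key -> (stored_at, response)

    def put(self, key, response, now):
        self._entries[key] = (now, response)

    def get(self, key, now):
        entry = self._entries.get(key)
        if entry is None:
            return None
        stored_at, response = entry
        if now - stored_at > self.ttl:
            del self._entries[key]  # expired: treat as a miss
            return None
        return response
```

Changing any one of the three inputs produces a different key, which is why a request with different options would be processed (and billed) again.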

7/2/2024

Improved parsing for Powerpoint presentations and Word documents.
Add status page and changelog.

6/26/2024

Increase concurrency limits for all users

6/25/2024

Return page count from all endpoints
Users can now disable marker image extraction
Webhooks are now supported as an alternative to polling. Webhooks will ping a given URL when inference is complete.
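A minimal sketch of a receiver for such a webhook, using only the Python standard library. The changelog does not specify the request method or payload, so this assumes an HTTP POST with a JSON object body; the handler, function names, and port are hypothetical.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def parse_webhook_body(body: bytes) -> dict:
    """Decode a JSON webhook body (payload fields are not specified here)."""
    payload = json.loads(body.decode("utf-8"))
    if not isinstance(payload, dict):
        raise ValueError("expected a JSON object")
    return payload


class WebhookHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        try:
            payload = parse_webhook_body(self.rfile.read(length))
        except ValueError:
            # Covers malformed JSON, bad encoding, and non-object bodies.
            self.send_response(400)
            self.end_headers()
            return
        # Hypothetical follow-up: fetch or store the finished result here.
        print("inference complete:", payload)
        self.send_response(200)
        self.end_headers()


def serve(port=8000):
    """Run the receiver until interrupted."""
    HTTPServer(("0.0.0.0", port), WebhookHandler).serve_forever()
```

Calling serve() starts the listener; the URL you register would then point at this host and port.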

6/21/2024

Initial support for Microsoft Word and Microsoft Powerpoint documents (docx/doc/pptx/ppt).

6/18/2024

Enable paginating marker output.

5/31/2024

Initial launch of marker and surya APIs.
