Datalab Marker Prompt API
Datalab’s Marker Prompt API lets you use natural language to correct or tailor Marker output to your preferences. It’s designed for cases when you want to nudge Marker in a different direction and steer its output. You can use it to:
- Merge tables across pages.
- Correct OCR errors.
- Fill in missing data.
- Handle unique edge cases you encounter with your documents.
Prompting Tips
We recommend using Forge Playground to evaluate your prompts and iterate on them. The prompting tips that apply generally also apply to the Marker Prompt API: be as explicit as you can in your instructions and provide context for your changes. The context we provide alongside your prompt includes:
- Our own prompt, which has the LLM adhere to yours as closely as possible and provides other context (e.g. “we’ll show you an image of a page + JSON blocks, here is how those blocks are formatted, here is how page bounding boxes are normalized, here is how you can use them as evidence when deciding to make changes”, etc.)
- Blocks formatted as JSON, each of which has an html key.
- Images of pages.
Your prompt is used in two ways:
- Block rewriting: on every single page, we use your prompt to decide whether blocks need to be rewritten.
- Cross-page merging: we use your prompt to decide whether blocks need to be merged across every two pages.
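To make the two uses concrete, here is a minimal Python sketch of how such a pipeline could be organized. It is not Datalab's implementation: the function names, the block shape (a dict with only an html key), and the reading of "every two pages" as consecutive page pairs are all assumptions made for illustration. The rewrite and merge callables stand in for the LLM-driven decisions, which the real service makes from your prompt, the JSON blocks, and the page images.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical shapes: a block is a dict with an "html" key, a page is a list of blocks.
Block = Dict[str, str]
Page = List[Block]


def run_prompt_passes(
    pages: List[Page],
    prompt: str,
    rewrite: Callable[[str, Page], Page],
    merge: Callable[[str, Page, Page], Tuple[Page, Page]],
) -> List[Page]:
    """Apply a per-page rewrite step, then a merge step over consecutive page pairs."""
    # Block rewriting: every page is visited once.
    result = [rewrite(prompt, page) for page in pages]
    # Cross-page merging: every pair of consecutive pages is visited once
    # (assumed interpretation of "every two pages").
    for i in range(len(result) - 1):
        result[i], result[i + 1] = merge(prompt, result[i], result[i + 1])
    return result


def toy_rewrite(prompt: str, page: Page) -> Page:
    # Stand-in for the LLM: fixes one hard-coded OCR error and ignores the prompt.
    return [{**block, "html": block["html"].replace("0CR", "OCR")} for block in page]


def toy_merge(prompt: str, left: Page, right: Page) -> Tuple[Page, Page]:
    # Stand-in for the LLM: joins a table that ends one page with a table
    # that starts the next page, and ignores the prompt.
    if (
        left
        and right
        and left[-1]["html"].startswith("<table>")
        and right[0]["html"].startswith("<table>")
    ):
        joined = left[-1]["html"][: -len("</table>")] + right[0]["html"][len("<table>"):]
        return left[:-1] + [{**left[-1], "html": joined}], right[1:]
    return left, right


pages = [
    [{"html": "<p>0CR text</p>"}, {"html": "<table><tr><td>a</td></tr></table>"}],
    [{"html": "<table><tr><td>b</td></tr></table>"}, {"html": "<p>end</p>"}],
]
out = run_prompt_passes(pages, "Merge tables across pages.", toy_rewrite, toy_merge)
```

In this toy run, the OCR typo on the first page is corrected during the rewrite step, and the table that continues onto the second page is folded into the table on the first page during the merge step. Running the rewrite step first means the merge step sees already-corrected blocks; whether the real service orders the steps this way is not stated here.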