/api/v1/marker (docs here) to generate parse output followed by a request to /api/v1/marker/prompt to apply your prompt.
Here is an example in Python:
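A minimal sketch of the two-step flow, under stated assumptions. Only the endpoint paths, output_format, save_checkpoint, and checkpoint_id come from this page. The base URL, the X-Api-Key header, the file form field, the request_check_url and status polling fields, and the prompt key in the second request are assumptions to check against the API reference.

```python
import time

BASE_URL = "https://www.datalab.to"  # assumption: replace with your API host


def marker_form_fields() -> dict:
    """Form fields the initial /api/v1/marker request needs for the Prompt API."""
    return {"output_format": "chunks", "save_checkpoint": True}


def prompt_payload(marker_result: dict, prompt: str) -> dict:
    """Body for /api/v1/marker/prompt. The 'prompt' key name is an assumption."""
    checkpoint_id = marker_result.get("checkpoint_id")
    if not checkpoint_id:
        raise ValueError("no checkpoint_id; was save_checkpoint set on the Marker request?")
    return {"checkpoint_id": checkpoint_id, "prompt": prompt}


def poll(check_url: str, headers: dict, delay: float = 2.0, attempts: int = 150) -> dict:
    """Poll until the request finishes (field names here are assumptions)."""
    import requests  # third-party; imported lazily so the helpers above need no install

    for _ in range(attempts):
        result = requests.get(check_url, headers=headers, timeout=30).json()
        if result.get("status") == "complete":
            return result
        time.sleep(delay)
    raise TimeoutError("Marker request did not complete in time")


def parse_then_prompt(pdf_path: str, prompt: str, api_key: str) -> dict:
    """Step 1: parse with chunks + save_checkpoint. Step 2: apply the prompt."""
    import requests

    headers = {"X-Api-Key": api_key}  # assumption: header name
    with open(pdf_path, "rb") as f:
        submitted = requests.post(
            f"{BASE_URL}/api/v1/marker",
            headers=headers,
            files={"file": f},
            data=marker_form_fields(),
            timeout=60,
        ).json()
    marker_result = poll(submitted["request_check_url"], headers)

    response = requests.post(
        f"{BASE_URL}/api/v1/marker/prompt",
        headers=headers,
        json=prompt_payload(marker_result, prompt),
        timeout=60,
    )
    response.raise_for_status()
    return response.json()
```

The two small helpers are kept separate from the network calls so the required settings (chunks output and save_checkpoint) and the checkpoint_id handoff are easy to see and to test without an API key.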
The Marker Prompt API relies on two additions to Marker requests:
- A new output format: chunks
- A new capability: save_checkpoint
The chunks output format
The Marker Prompt API only works on Marker requests that use the chunks output format.
The chunks output format looks a lot like our json format (documented here) with two important changes:
- All blocks are flattened: every block is returned in a single flat list.
- Because all blocks are flattened:
  - … there are no page blocks in the output.
  - … only top-level blocks on each page are in the output.
  - … there are no children on blocks.
  - … the html field will render HTML from all nested children without recursive references.
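To make the flattening concrete, here is a rough sketch of how a nested json-style tree could be turned into chunks-style output. It is an illustration, not the service's implementation. It assumes the json format nests pages under a root children list, nests blocks under each page, and marks child positions in html with content-ref tags whose src is the child's id.

```python
import re

# Assumed placeholder syntax for a nested child inside a parent's html.
CONTENT_REF = re.compile(r"<content-ref src=['\"]([^'\"]+)['\"]>\s*</content-ref>")


def render_html(block: dict) -> str:
    """Inline every nested child's html so no references remain."""
    children = {child["id"]: child for child in block.get("children") or []}

    def expand(match: re.Match) -> str:
        child = children.get(match.group(1))
        return render_html(child) if child else ""

    return CONTENT_REF.sub(expand, block.get("html", ""))


def flatten_to_chunks(document: dict) -> list[dict]:
    """Keep only top-level blocks of each page, without page blocks or children."""
    chunks = []
    for page in document.get("children") or []:
        for block in page.get("children") or []:
            chunk = {key: value for key, value in block.items() if key != "children"}
            chunk["html"] = render_html(block)
            chunks.append(chunk)
    return chunks
```

Each resulting chunk carries self-contained html, which is why the children field is no longer needed.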
Response Fields
When you set output_format to chunks in your Marker request, all the response fields will be the same (see them here) except you will also have a new key, chunks.
The chunks key contains a list of JSON objects, each of which has these fields:
- id is the block id
- block_type is the block type
- page is the page number
- section_hierarchy indicates the section that the block is part of
- html contains fully-rendered HTML without recursive references to child blocks (which are not available in chunks output)
- bbox is an [x1, y1, x2, y2] bounding box for the block
- polygon is a 4-corner version of bbox in [[x1,y1], [x2,y2], [x3,y3], [x4,y4]] format
- images is a JSON object with block ID keys and base64-encoded image data values
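Because chunks is a flat list, working with it needs no tree traversal. The sketch below groups chunk HTML by page and filters by block type. It uses only the page, html, and block_type fields listed above; the helper names are hypothetical.

```python
from collections import defaultdict


def html_by_page(chunks: list[dict]) -> dict:
    """Join the html of every chunk on the same page, in list order."""
    pages = defaultdict(list)
    for chunk in chunks:
        pages[chunk["page"]].append(chunk["html"])
    return {page: "\n".join(parts) for page, parts in sorted(pages.items())}


def chunks_of_type(chunks: list[dict], block_type: str) -> list[dict]:
    """Select chunks whose block_type matches exactly."""
    return [chunk for chunk in chunks if chunk["block_type"] == block_type]
```

For example, html_by_page(result["chunks"]) gives one HTML string per page, assuming result is the parsed JSON of a completed Marker request.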
Rendering images in HTML
When your chunks have images in them, you’ll see them rendered in html like this: <img src='/page/0/Figure/9'>.
The string in src is a key to the images field in the chunk. To render images, you’ll need to substitute that key with the base64-encoded image value in the images field described above.
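One way to do that substitution is to rewrite each src into a data URI, as in the sketch below. The helper name is hypothetical, and the image/jpeg MIME type is an assumption; use whichever format your images are actually encoded in.

```python
import re

# Matches <img ... src='KEY'> or <img ... src="KEY"> and captures the key.
IMG_SRC = re.compile(r"""(<img\b[^>]*?\bsrc=)(['"])(.*?)\2""")


def inline_images(chunk: dict, mime: str = "image/jpeg") -> str:
    """Return the chunk's html with image keys replaced by base64 data URIs."""
    images = chunk.get("images") or {}

    def swap(match: re.Match) -> str:
        prefix, quote, key = match.group(1), match.group(2), match.group(3)
        data = images.get(key)
        if data is None:
            return match.group(0)  # leave unknown keys untouched
        return f"{prefix}{quote}data:{mime};base64,{data}{quote}"

    return IMG_SRC.sub(swap, chunk["html"])
```

The returned HTML can be displayed directly in a browser because the image data is embedded in the markup rather than referenced by key.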
save_checkpoint is required for the Marker Prompt API
In order to use the Marker Prompt API, your initial Marker request must set save_checkpoint to True as shown in the example on this page.
save_checkpoint temporarily saves a cache of your initial Marker request's parse in our system and returns a checkpoint_id in the final result. The /api/v1/marker/prompt endpoint requires this checkpoint_id.
save_checkpoint is a new feature and is currently only used with the Marker Prompt API. We’ll be using save_checkpoint for more upcoming document post-processing features, and over time we’ll roll out convenient ways to parse and apply prompts in one pass in our API and SDK.