/api/v1/marker (docs here) to generate parse output followed by a request to /api/v1/marker/prompt to apply your prompt.
Here is an example in Python:
- A new output format: chunks
- A new capability: save_checkpoint
The chunks output format
The Marker Prompt API only works on Marker requests that use the chunks output format.
The chunks output format looks a lot like our json format (documented here) with two important changes:
- All blocks are flattened: every block is returned in a single flat list.
- Because all blocks are flattened:
  - … there are no page blocks in the output.
  - … only top-level blocks on each page are in the output.
  - … there are no children on blocks.
  - … the html field renders the HTML of all nested children, without recursive references.
Response Fields
When you set output_format to chunks in your Marker request, all the response fields will be the same (see them here) except you will also have a new key, chunks.
The chunks key contains a list of JSON objects, each of which has these fields:
- id is the block ID
- block_type is the block type
- page is the page number
- section_hierarchy indicates the section that the block is part of
- html contains fully-rendered HTML without recursive references to child blocks (which are not available in chunks output)
- bbox is an [x1, y1, x2, y2] bounding box for the block
- polygon is a 4-corner version of bbox in [[x1,y1], [x2,y2], [x3,y3], [x4,y4]] format
- images is a JSON object with block ID keys and base64-encoded image data values
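As a rough illustration of working with these fields, the sketch below groups the entries of the chunks list by page number. The helper name group_chunks_by_page and every value in sample_result are invented for illustration (including the shape of section_hierarchy, which is only an assumption); only the field names come from the list above.

```python
from collections import defaultdict
from typing import Any


def group_chunks_by_page(result: dict[str, Any]) -> dict[int, list[dict[str, Any]]]:
    """Group the flat list under the "chunks" key by each chunk's page number."""
    pages: dict[int, list[dict[str, Any]]] = defaultdict(list)
    for chunk in result.get("chunks", []):
        pages[chunk["page"]].append(chunk)
    return dict(pages)


# Made-up sample data; a real response will contain different values.
sample_result = {
    "chunks": [
        {
            "id": "/page/0/SectionHeader/1",
            "block_type": "SectionHeader",
            "page": 0,
            "section_hierarchy": {},  # shape assumed for this sketch
            "html": "<h1>Introduction</h1>",
            "bbox": [10, 10, 200, 40],
            "polygon": [[10, 10], [200, 10], [200, 40], [10, 40]],
            "images": {},
        },
        {
            "id": "/page/1/Text/0",
            "block_type": "Text",
            "page": 1,
            "section_hierarchy": {},
            "html": "<p>Body text.</p>",
            "bbox": [10, 50, 200, 90],
            "polygon": [[10, 50], [200, 50], [200, 90], [10, 90]],
            "images": {},
        },
    ]
}

chunks_by_page = group_chunks_by_page(sample_result)
for page, page_chunks in sorted(chunks_by_page.items()):
    print(page, [c["block_type"] for c in page_chunks])
```

Because the output is already flat, no recursion over children is needed; a single pass over the list is enough.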
Rendering images in HTML
When your chunks have images in them, you’ll see them rendered in html like this: <img src='/page/0/Figure/9'>.
The string in src is a key into the images field of the chunk. To render images, you’ll need to replace that key with the base64-encoded image value from the images field described above.
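One way to do this substitution is to rewrite each src attribute as a data URI. The following is a minimal sketch, not an official helper: the function name inline_images is hypothetical, and the image/jpeg MIME type is an assumption, since the text above does not say which image format the base64 data encodes.

```python
import re
from typing import Any


def inline_images(chunk: dict[str, Any], mime_type: str = "image/jpeg") -> str:
    """Return the chunk's html with image keys in src replaced by data URIs.

    mime_type is an assumption; adjust it to the actual image format.
    """
    html = chunk["html"]
    images = chunk.get("images") or {}

    def replace_src(match: re.Match) -> str:
        quote, key = match.group(1), match.group(2)
        data = images.get(key)
        if data is None:
            # Leave any src that is not a key in the images field untouched.
            return match.group(0)
        return f"src={quote}data:{mime_type};base64,{data}{quote}"

    # Match src='...' or src="..." and capture the quote style and the key.
    return re.sub(r"""src=(['"])(.*?)\1""", replace_src, html)


# Made-up chunk for illustration; "aGVsbG8=" is placeholder base64 text,
# not a real image.
sample_chunk = {
    "html": "<p>See figure.</p><img src='/page/0/Figure/9'>",
    "images": {"/page/0/Figure/9": "aGVsbG8="},
}

rendered = inline_images(sample_chunk)
print(rendered)
```

A regular expression is enough here because the keys appear verbatim in src; an HTML parser would be a sturdier choice if you need to handle arbitrary markup.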
save_checkpoint is required for the Marker Prompt API
In order to use the Marker Prompt API, your initial Marker request must set save_checkpoint to True, as shown in the example on this page.
save_checkpoint temporarily saves a cache of your initial Marker request's parse in our system and returns a checkpoint_id in the final result. The /api/v1/marker/prompt endpoint requires this checkpoint_id.
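The two requirements described so far (chunks output and save_checkpoint) can be sketched as a pair of small helpers. This is only an illustration under stated assumptions: the helper names are hypothetical, how the options are encoded on the wire (form fields versus JSON, and how True is serialized) is not specified here, and the sample result below is invented.

```python
from typing import Any


def marker_request_options() -> dict[str, Any]:
    """Options the initial Marker request needs for the Marker Prompt API.

    Only the two option names come from the documentation; how they are
    sent (form data vs. JSON) is an assumption left to the caller.
    """
    return {"output_format": "chunks", "save_checkpoint": True}


def get_checkpoint_id(final_result: dict[str, Any]) -> str:
    """Pull checkpoint_id out of the final Marker result, failing loudly if absent."""
    checkpoint_id = final_result.get("checkpoint_id")
    if not checkpoint_id:
        raise ValueError(
            "No checkpoint_id in the result; was save_checkpoint set on the "
            "initial Marker request?"
        )
    return checkpoint_id


# Invented final result for illustration only.
sample_final_result = {"chunks": [], "checkpoint_id": "example-checkpoint-id"}

options = marker_request_options()
checkpoint_id = get_checkpoint_id(sample_final_result)
print(options, checkpoint_id)
```

Checking for the checkpoint_id before calling /api/v1/marker/prompt surfaces a missing save_checkpoint setting early, rather than as a failed prompt request.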
save_checkpoint is a new feature and is currently only used with the Marker Prompt API. We’ll be using save_checkpoint for more upcoming document post-processing features, and over time we’ll roll out convenient ways to parse and apply prompts in one pass in our API and SDK.