TL;DR
Cardinal converts PDFs and images into clean, structured Markdown using this endpoint.
Table handling options:
- Default: When tables are present, everything (including text) is returned as HTML
- Option 1: Set markdownPreserveTables: true → Tables stay HTML, everything else converts to Markdown
- Option 2: Set markdown: true → Everything converts to Markdown
- ⚠️ Warning: Markdown tables may lose fidelity (merged cells, nested tables, etc.)
- Annotations
- Checkmarks
- Spanning tables (row/column merges)
- Complex tables (multi-level, nested)
- And more.
Endpoint
POST https://api.trycardinal.ai/split
Content-Type: multipart/form-data
Auth: X-API-KEY: <API_KEY>
You may provide either file or fileUrl.
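A minimal sketch of how a request could be assembled from the pieces above. The field names fileUrl, markdownPreserveTables, and markdown and the X-API-KEY header come from this page. The helper names (build_form_fields, build_headers, send_request), the "true" string encoding of the flags, the JSON response body, and the use of the third-party requests library are assumptions made for illustration, not confirmed API behavior.

```python
from typing import Any, Dict


def build_form_fields(file_url: str, table_mode: str = "default") -> Dict[str, str]:
    """Build the form fields for one request (hypothetical helper).

    table_mode is an illustrative switch, not an API parameter:
      "default"          -> send no table flag
      "preserve_tables"  -> markdownPreserveTables: true
      "markdown"         -> markdown: true
    """
    fields = {"fileUrl": file_url}
    if table_mode == "preserve_tables":
        fields["markdownPreserveTables"] = "true"  # assumed string encoding
    elif table_mode == "markdown":
        fields["markdown"] = "true"  # assumed string encoding
    elif table_mode != "default":
        raise ValueError(f"unknown table_mode: {table_mode!r}")
    return fields


def build_headers(api_key: str) -> Dict[str, str]:
    """Auth header as documented: X-API-KEY: <API_KEY>."""
    return {"X-API-KEY": api_key}


def send_request(endpoint_url: str, api_key: str, fields: Dict[str, str]) -> Any:
    """POST the fields as multipart/form-data (assumes `requests` is installed)."""
    import requests  # third-party; imported lazily so the builders work without it

    # Passing (None, value) tuples via `files` makes requests encode plain
    # fields as multipart/form-data and set the boundary header itself.
    multipart = {name: (None, value) for name, value in fields.items()}
    response = requests.post(
        endpoint_url, headers=build_headers(api_key), files=multipart, timeout=60
    )
    response.raise_for_status()
    return response.json()  # assumption: the response body is JSON
```

To upload a local file instead of passing fileUrl, the same multipart body would carry a file part; the exact part name and accepted types should be checked against the API reference.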
Pagination
Large documents are processed in batches of up to 100 pages to prevent oversized responses. Each API call returns one batch, along with a pagination object in the response.
Pagination Fields
- start_page: First page returned in this batch
- end_page: Last page returned in this batch
- page_limit: Maximum number of pages per batch (default: 100)
- has_more_pages: Whether additional pages remain
- next_start_page: Use this value to request the next batch
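As a sketch, the pagination fields could be modeled on the client side like this. The five field names are taken from the list above. The Pagination class itself, the assumption that next_start_page may be missing or null on the last batch, and the fallback of page_limit to 100 are illustrative choices rather than documented behavior.

```python
from dataclasses import dataclass
from typing import Any, Mapping, Optional


@dataclass(frozen=True)
class Pagination:
    """Client-side view of the pagination object (hypothetical type)."""

    start_page: int
    end_page: int
    page_limit: int
    has_more_pages: bool
    next_start_page: Optional[int]

    @classmethod
    def from_dict(cls, raw: Mapping[str, Any]) -> "Pagination":
        # Assumption: next_start_page can be absent or null when no pages remain.
        next_start = raw.get("next_start_page")
        return cls(
            start_page=int(raw["start_page"]),
            end_page=int(raw["end_page"]),
            page_limit=int(raw.get("page_limit", 100)),  # documented default
            has_more_pages=bool(raw["has_more_pages"]),
            next_start_page=int(next_start) if next_start is not None else None,
        )
```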
Example Usage
You can paginate sequentially or in parallel:
- Sequential flow:
  - Call /markdown with no pagination params → returns pages 1–100.
  - Check has_more_pages: true → use next_start_page: 101.
  - Call /markdown again with startPage=101 → fetches the next batch.
- Parallel flow: You can launch multiple requests at once using startPage offsets (e.g., 1, 101, 201, …) to fetch batches concurrently.
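Both flows can be sketched as follows. The HTTP call is left to a caller-supplied fetch_batch function (hypothetical) that takes a startPage value, or None for the first call, and returns the parsed response. The "pagination" key used to locate the pagination object is an assumption, and the parallel version assumes the total page count is already known to the caller, which this page does not describe how to obtain.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Any, Callable, Dict, List, Optional

Batch = Dict[str, Any]
FetchBatch = Callable[[Optional[int]], Batch]  # startPage (or None) -> response


def fetch_sequential(fetch_batch: FetchBatch) -> List[Batch]:
    """Follow next_start_page until has_more_pages is false."""
    batches: List[Batch] = []
    start_page: Optional[int] = None  # first call sends no pagination params
    while True:
        batch = fetch_batch(start_page)
        batches.append(batch)
        pagination = batch["pagination"]  # assumed key name
        if not pagination.get("has_more_pages"):
            return batches
        start_page = pagination["next_start_page"]


def fetch_parallel(
    fetch_batch: FetchBatch,
    total_pages: int,
    page_limit: int = 100,
    max_workers: int = 4,
) -> List[Batch]:
    """Request every batch concurrently using startPage offsets 1, 101, 201, ..."""
    starts = list(range(1, total_pages + 1, page_limit))
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map returns results in the same order as `starts`.
        return list(pool.map(fetch_batch, starts))
```

The sequential version needs no prior knowledge of the document length, while the parallel version trades that for lower wall-clock time; max_workers is an arbitrary illustrative value and should respect whatever rate limits apply.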