Changelog

October 7, 2025

Added end_page as a parameter in the API.
Improved infrastructure: memory handling is now more efficient, which allows more parallel runs.
Added ephemeral flag to prevent storage of images, signatures, barcodes, etc.

October 18, 2025

Launched the Cardinal Benchmark — public dataset and evaluation scripts for OCR and layout models.

October 14, 2025

Released OCR Showdown — benchmarked Cardinal against industry models across 200+ documents.

October 10, 2025

Published “How We Work Under the Hood” — new blog post explaining our document intelligence pipeline.

October 7, 2025

Launched /compare endpoint — compare multiple extractions for accuracy benchmarking.

October 3, 2025

Added figure digitization — automatically converts embedded charts and figures into structured data.

September 29, 2025

Added dense PDF parsing — improved OCR accuracy for heavy-layout and scanned documents.

September 25, 2025

Added barcode and signature detection.

September 20, 2025

Added image and image metadata detection.

September 16, 2025

Added image file support — platform now accepts both PDFs and image uploads.

September 13, 2025

Launched /rag endpoint with bounding boxes — generate RAG-ready data with spatial awareness.

September 9, 2025

Launched /split endpoint — enables intelligent document splitting by sections and types.

September 6, 2025

Launched /extract (fast) endpoint — optimized version for faster structured extraction.

September 3, 2025

Launched /extract endpoint (slow version) — first iteration of structured data extraction.

September 1, 2025

Launched /markdown endpoint — converts documents into clean Markdown with layout preservation.
