October 18, 2025
Launched the Cardinal Benchmark — public dataset and evaluation scripts for OCR and layout models.
October 14, 2025
Released OCR Showdown — benchmarked Cardinal against industry models across 200+ documents.
October 10, 2025
Published “How We Work Under the Hood” — new blog post explaining our document intelligence pipeline.
October 7, 2025
Launched /compare endpoint — compare multiple extractions for accuracy benchmarking.
October 3, 2025
Added figure digitization — automatically converts embedded charts and figures into structured data.
September 29, 2025
Added dense PDF parsing — improved OCR accuracy for heavy-layout and scanned documents.
September 25, 2025
Added barcode and signature detection
September 20, 2025
Added image and image metadata detection
September 16, 2025
Added image file support — platform now accepts both PDFs and image uploads.
September 13, 2025
Launched /rag endpoint with bounding boxes — generate RAG-ready data with spatial awareness.
September 9, 2025
Launched /split endpoint — enables intelligent document splitting by sections and types.
September 6, 2025
Launched /extract (fast) endpoint — optimized version for faster structured extraction.
September 3, 2025
Launched /extract endpoint (slow version) — first iteration of structured data extraction.
September 1, 2025
Launched /markdown endpoint — converts documents into clean Markdown with layout preservation.