Accuracy Overview

When evaluating document intelligence tools, accuracy is usually the first question:
  • How do you compare to other OCR tools and/or LLM-based approaches?
  • Have you run benchmarks?
This page summarizes how Cardinal performs and what makes it different.

Benchmarks & Results

We’ve evaluated Cardinal on a range of public benchmarks as well as internal test suites.
  • Full results: Email us at team@trycardinal.ai to get access!
  • Performance: Cardinal consistently matches or outperforms other tools and generic LLMs on dense tables, structured forms, and real-world edge cases.
  • Strengths: excels at complex tables, small text, mixed layouts, and annotations that typically break baseline LLMs.
  • Limitations: as with any system, extremely degraded scans, messy handwriting, or corrupted PDFs may still require human review.

Why Not Just an LLM?

It’s tempting to simply hand the PDF to GPT or Gemini and hope for structured output. In practice, that runs into a few problems:
  • LLMs hallucinate — they may invent rows/columns or misalign values.
  • LLMs lose structure — PDFs with tables, checkmarks, barcodes, or multiple columns rarely survive a naive LLM parse.
Cardinal, by contrast, preserves fidelity: we return bounding boxes, cropped images, and schema-mapped JSON, so you know exactly where every value came from.
Cardinal is not “just an LLM wrapper.” We combine OCR, layout models, and post-processing layers with selective LLM use. This hybrid approach gives deterministic structure with AI flexibility where it matters.
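
To make that concrete, here is a rough sketch of what schema-mapped output with provenance could look like, and how you might act on it. The field names, coordinate convention, and confidence scores below are illustrative assumptions, not Cardinal’s actual response format.

    # Hypothetical shape of schema-mapped output with provenance.
    # Field names, bbox convention, and confidence values are illustrative only.
    extracted = {
        "invoice_number": {
            "value": "INV-10492",
            "page": 1,
            "bbox": [112, 640, 298, 662],  # assumed [x0, y0, x1, y1] in page pixels
            "confidence": 0.97,
        },
        "total_amount": {
            "value": "1,284.50",
            "page": 2,
            "bbox": [455, 1020, 540, 1042],
            "confidence": 0.82,
        },
    }

    # Because every value carries a bounding box, low-confidence fields can be
    # routed to human review alongside the exact source region on the page.
    for field, data in extracted.items():
        if data["confidence"] < 0.9:
            print(f"Review {field!r}: page {data['page']}, region {data['bbox']}")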

How are you different from other OCR tools?

Put simply, we’re more accurate. We made our API open for a reason, so try us out!
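
If you want to verify accuracy on your own documents, a request could look roughly like the sketch below. The endpoint path, auth header, and parameter names are placeholders rather than the real interface; the Cardinal API docs have the actual details.

    import requests

    # Placeholder sketch of an extraction request. The URL, auth header, and
    # form fields below are assumptions, not Cardinal's documented API.
    API_KEY = "your-api-key"

    with open("sample_invoice.pdf", "rb") as f:
        resp = requests.post(
            "https://api.trycardinal.ai/v1/extract",  # placeholder endpoint
            headers={"Authorization": f"Bearer {API_KEY}"},
            files={"file": f},
            data={"schema": "invoice"},  # placeholder parameter
            timeout=120,
        )

    resp.raise_for_status()
    print(resp.json())  # schema-mapped JSON with bounding boxes, as described above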