Cardinal
API Documentation

Output Formats

Cardinal OCR supports multiple output formats: JSON with custom schemas, Markdown, signature extraction, and barcode detection. Each format includes comprehensive metadata and precise coordinate data.

JSON Format

Structured Data with Custom Schemas
Well suited to APIs, databases, and programmatic processing. The JSON schemas shown on this page are examples; you can supply your own custom schema via our API.

Best for:

  • API integrations and data pipelines
  • Database storage and indexing
  • Machine learning and data analysis
  • Automated document processing workflows
{
  "status": "success",
  "document_id": "doc_abc123",
  "content": {
    "text": "Invoice #12345\nDate: 2024-01-15\nAmount: $1,250.00\nBill To: Acme Corp",
    "structured_data": {
      "document_type": "invoice",
      "invoice_number": "12345",
      "date": "2024-01-15",
      "amount": 1250.00,
      "currency": "USD",
      "vendor": {
        "name": "Cardinal Services",
        "address": "123 Main St, San Francisco, CA"
      },
      "customer": {
        "name": "Acme Corp",
        "address": "456 Business Ave, New York, NY"
      },
      "line_items": [
        {
          "description": "Professional Services",
          "quantity": 1,
          "unit_price": 1250.00,
          "total": 1250.00
        }
      ]
    }
  },
  "metadata": {
    "page_count": 1,
    "processing_time_ms": 1250,
    "confidence_score": 0.98,
    "language": "en"
  }
}
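
As a minimal sketch of consuming a response shaped like the example above, the following assumes the response body is already available as a JSON string. The helper `line_items_match_total` and the consistency check are illustrative and not part of the Cardinal API.

```python
import json
from decimal import Decimal

# Trimmed copy of the example response above.
RESPONSE_TEXT = """
{
  "status": "success",
  "content": {
    "structured_data": {
      "invoice_number": "12345",
      "amount": 1250.00,
      "line_items": [
        {"description": "Professional Services", "quantity": 1,
         "unit_price": 1250.00, "total": 1250.00}
      ]
    }
  },
  "metadata": {"confidence_score": 0.98}
}
"""


def line_items_match_total(structured_data: dict) -> bool:
    """Return True when the line item totals add up to the invoice amount."""
    items_total = sum(
        Decimal(str(item["total"])) for item in structured_data["line_items"]
    )
    return items_total == Decimal(str(structured_data["amount"]))


# parse_float=Decimal avoids binary floating-point rounding on currency values.
response = json.loads(RESPONSE_TEXT, parse_float=Decimal)
data = response["content"]["structured_data"]
is_consistent = response["status"] == "success" and line_items_match_total(data)
```

Parsing amounts as `Decimal` is a design choice for financial data; plain floats would also work for this small example.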

Custom Schema Support

The JSON schemas shown are examples. Input your own custom schema via our API to get structured data that matches your exact requirements, field names, and data types.
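
This page does not show how a custom schema is submitted, so the following is only a local sketch. It assumes you describe the fields you expect as a plain mapping of field names to Python types and then check a returned `structured_data` object against that mapping. `EXPECTED_FIELDS` and `find_schema_mismatches` are hypothetical names, not Cardinal API parameters.

```python
# Hypothetical description of the fields a custom schema might request.
EXPECTED_FIELDS = {
    "invoice_number": str,
    "date": str,
    "amount": (int, float),
    "currency": str,
    "line_items": list,
}


def find_schema_mismatches(structured_data: dict, expected: dict) -> list[str]:
    """List fields that are missing or whose value has an unexpected type."""
    problems = []
    for name, expected_type in expected.items():
        if name not in structured_data:
            problems.append(f"missing field: {name}")
        elif not isinstance(structured_data[name], expected_type):
            problems.append(f"wrong type for field: {name}")
    return problems


# A sample object that lacks the "line_items" field.
sample = {
    "invoice_number": "12345",
    "date": "2024-01-15",
    "amount": 1250.00,
    "currency": "USD",
}
mismatches = find_schema_mismatches(sample, EXPECTED_FIELDS)
```

A check like this can run right after parsing, so that downstream code can rely on the field names and types you asked for.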

Markdown Format

Human-Readable Text
Clean, formatted text that's easy to read and edit

Best for:

  • Documentation and content management
  • Blog posts and articles
  • README files and technical writing
  • Human review and editing workflows
# Invoice #12345

**Date:** 2024-01-15  
**Amount:** $1,250.00

## Vendor Information
**Cardinal Services**  
123 Main St  
San Francisco, CA

## Bill To
**Acme Corp**  
456 Business Ave  
New York, NY

## Line Items

| Description | Quantity | Unit Price | Total |
|-------------|----------|------------|-------|
| Professional Services | 1 | $1,250.00 | $1,250.00 |

---

**Total Amount Due:** $1,250.00
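
If the Markdown output needs light post-processing, a table like the one above can be turned back into rows. The sketch below assumes the text contains a single pipe-delimited table with a header row and a separator row, and that cells contain no escaped pipes. `parse_markdown_table` is an illustrative helper, not something provided by Cardinal.

```python
def split_row(line: str) -> list[str]:
    """Split one pipe-delimited table line into trimmed cell values."""
    return [cell.strip() for cell in line.strip().strip("|").split("|")]


def parse_markdown_table(markdown: str) -> list[dict]:
    """Convert a single Markdown table into a list of {header: cell} dicts."""
    table_lines = [
        line for line in markdown.splitlines() if line.strip().startswith("|")
    ]
    if len(table_lines) < 2:
        return []
    headers = split_row(table_lines[0])
    # table_lines[1] is the |---|---| separator row, so data starts at index 2.
    return [dict(zip(headers, split_row(line))) for line in table_lines[2:]]


MARKDOWN = """## Line Items

| Description | Quantity | Unit Price | Total |
|-------------|----------|------------|-------|
| Professional Services | 1 | $1,250.00 | $1,250.00 |
"""

rows = parse_markdown_table(MARKDOWN)
```

All cell values come back as strings; for typed values, the JSON format is likely the better fit.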

Signature Extraction

Handwritten & Digital Signatures
Detect and extract signatures with bounding box coordinates and confidence scores

Signature Types:

  • Handwritten signatures
  • Digital signatures
  • Initials and stamps
  • Signature fields and validation
{
  "signatures": [
    {
      "id": "sig_1",
      "bounding_box": {"x": 400, "y": 650, "width": 150, "height": 50},
      "confidence": 0.94,
      "type": "handwritten_signature",
      "page": 1,
      "extracted_image_url": "https://cdn.trycardinal.ai/signatures/sig_1.png"
    },
    {
      "id": "sig_2", 
      "bounding_box": {"x": 100, "y": 700, "width": 120, "height": 40},
      "confidence": 0.89,
      "type": "digital_signature",
      "page": 1,
      "signer_name": "John Doe"
    }
  ],
  "signature_fields": [
    {
      "field_name": "client_signature",
      "required": true,
      "found": true,
      "signature_id": "sig_1"
    }
  ]
}
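
One way to use this output for validation is to check that every required signature field has a matching signature with sufficient confidence. This is a sketch under assumptions: the 0.90 threshold and the helper `unsigned_required_fields` are illustrative choices, not Cardinal defaults.

```python
MIN_CONFIDENCE = 0.90  # illustrative threshold, not a Cardinal default


def unsigned_required_fields(
    result: dict, min_confidence: float = MIN_CONFIDENCE
) -> list[str]:
    """Return names of required fields that lack a confident signature."""
    confidence_by_id = {
        sig["id"]: sig["confidence"] for sig in result.get("signatures", [])
    }
    missing = []
    for field in result.get("signature_fields", []):
        if not field.get("required"):
            continue
        # Unknown or absent signature ids count as zero confidence.
        confidence = confidence_by_id.get(field.get("signature_id"), 0.0)
        if not field.get("found") or confidence < min_confidence:
            missing.append(field["field_name"])
    return missing


# Trimmed copy of the example output above.
result = {
    "signatures": [
        {"id": "sig_1", "confidence": 0.94, "type": "handwritten_signature", "page": 1},
        {"id": "sig_2", "confidence": 0.89, "type": "digital_signature", "page": 1},
    ],
    "signature_fields": [
        {"field_name": "client_signature", "required": True,
         "found": True, "signature_id": "sig_1"},
    ],
}
missing_fields = unsigned_required_fields(result)
```

With the example data, `client_signature` is linked to `sig_1` at confidence 0.94, so nothing is flagged at the 0.90 threshold; a stricter threshold such as 0.95 would flag it.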

Barcode & QR Code Extraction

Machine-Readable Code Detection
Extract and decode barcodes, QR codes, and data matrix codes with precise positioning

Supported Formats:

  • QR codes with decoded content
  • Barcodes (Code 128, Code 39, UPC, etc.)
  • Data Matrix codes
  • PDF417 and Aztec codes
{
  "barcodes": [
    {
      "type": "QR_CODE",
      "value": "https://example.com/invoice/12345",
      "bounding_box": {"x": 450, "y": 100, "width": 80, "height": 80},
      "page": 1
    },
    {
      "type": "CODE_128",
      "value": "INV-12345-2024",
      "bounding_box": {"x": 50, "y": 600, "width": 200, "height": 40},
      "page": 1
    },
    {
      "type": "UPC_A",
      "value": "123456789012",
      "bounding_box": {"x": 300, "y": 400, "width": 150, "height": 30},
      "page": 1
    }
  ],
  "data_matrix": [
    {
      "value": "DM123456789",
      "bounding_box": {"x": 500, "y": 200, "width": 60, "height": 60},
      "page": 1
    }
  ]
}
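
A short sketch of collecting the decoded values from output shaped like the example above, grouped by code type. In the example, entries under `data_matrix` carry no `type` key, so the label `DATA_MATRIX` used here is an assumption made for this illustration, as is the helper name `values_by_type`.

```python
from collections import defaultdict


def values_by_type(result: dict) -> dict[str, list[str]]:
    """Group decoded values by code type."""
    codes = defaultdict(list)
    for code in result.get("barcodes", []):
        codes[code["type"]].append(code["value"])
    # Data Matrix entries have no "type" key in the example, so label them here.
    for code in result.get("data_matrix", []):
        codes["DATA_MATRIX"].append(code["value"])
    return dict(codes)


# Trimmed copy of the example output above (bounding boxes omitted).
result = {
    "barcodes": [
        {"type": "QR_CODE", "value": "https://example.com/invoice/12345", "page": 1},
        {"type": "CODE_128", "value": "INV-12345-2024", "page": 1},
        {"type": "UPC_A", "value": "123456789012", "page": 1},
    ],
    "data_matrix": [
        {"value": "DM123456789", "page": 1},
    ],
}
grouped = values_by_type(result)
```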

Images, Bounding Boxes & Metadata

Advanced Visual Element Extraction
Extract images, charts, and precise coordinate data for all document elements

What we extract:

  • Images and graphics with metadata
  • Charts, diagrams, and visual elements
  • Precise bounding box coordinates for all elements
  • Confidence scores and element classifications
  • Redlines, annotations, and markup
{
  "images": [
    {
      "id": "img_1",
      "url": "https://cdn.trycardinal.ai/images/doc_abc123_img_1.png",
      "bounding_box": {"x": 100, "y": 50, "width": 200, "height": 150},
      "description": "Company logo",
      "type": "logo",
      "confidence": 0.95
    }
  ],
  "charts": [
    {
      "id": "chart_1",
      "type": "bar_chart",
      "title": "Monthly Revenue",
      "description": "Revenue breakdown by month",
      "data_points": [
        {"label": "Jan", "value": 1000},
        {"label": "Feb", "value": 1250},
        {"label": "Mar", "value": 1100}
      ],
      "bounding_box": {"x": 50, "y": 300, "width": 400, "height": 200}
    }
  ],
  "bounding_boxes": [
    {
      "text": "Invoice #12345",
      "confidence": 0.99,
      "coordinates": {"x": 50, "y": 100, "width": 150, "height": 20},
      "page": 1,
      "font_size": 18,
      "is_bold": true
    }
  ]
}
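
Bounding boxes in the examples use `x`, `y`, `width`, and `height`. The sketch below converts such a box to corner coordinates and tests whether two boxes overlap. It assumes that `x` and `y` mark the top-left corner and that all four values share one unit; this page does not state the origin or the units, so treat both as assumptions. Note that text entries in the example store their box under `coordinates` rather than `bounding_box`. `Corners`, `to_corners`, and `boxes_overlap` are illustrative names.

```python
from typing import NamedTuple


class Corners(NamedTuple):
    left: int
    top: int
    right: int
    bottom: int


def to_corners(box: dict) -> Corners:
    """Convert an x/y/width/height box to corners (assumes a top-left origin)."""
    return Corners(
        box["x"], box["y"], box["x"] + box["width"], box["y"] + box["height"]
    )


def boxes_overlap(a: dict, b: dict) -> bool:
    """Return True when two boxes share any interior area."""
    ca, cb = to_corners(a), to_corners(b)
    return (
        ca.left < cb.right and cb.left < ca.right
        and ca.top < cb.bottom and cb.top < ca.bottom
    )


# Boxes taken from the example output above.
logo = {"x": 100, "y": 50, "width": 200, "height": 150}
chart = {"x": 50, "y": 300, "width": 400, "height": 200}
title = {"x": 50, "y": 100, "width": 150, "height": 20}

logo_corners = to_corners(logo)
```

Overlap checks like this can help with layout reconstruction, for example when deciding whether a text element sits on top of an image region.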

Advanced Metadata Extraction

  • Bounding Boxes: Precise coordinate data for every text element and visual component
  • Visual Elements: Automatic detection of charts, diagrams, logos, and graphics
  • Annotations: Redlines, markup, sticky notes, and document modifications
  • Typography: Font sizes, styles, and formatting information

Choosing the Right Format

Select the output format based on your specific use case and requirements:

Choose JSON for:
  • Data parsing and structured extraction
  • API integrations and microservices
  • Database storage and indexing
  • Machine learning and analytics
  • Automated processing workflows
Choose Markdown for:
  • Content searching and text analysis
  • Documentation and knowledge bases
  • Blog posts and articles
  • Human review workflows
  • Simple text processing
Add Specialized Extraction when you need to:
  • Extract signatures for document validation
  • Process documents that contain barcodes or QR codes
  • Obtain precise positioning data for layout reconstruction
  • Build applications that need complete document understanding
  • Extract logos, charts, or other graphical elements