Extract

Extract Structured Data by Schema

curl --request POST \
  --url https://api.trycardinal.ai/extract \
  --header 'Content-Type: multipart/form-data' \
  --header 'x-api-key: <api-key>' \
  --form 'file=@example-file' \
  --form 'schema=<string>' \
  --form 'fileUrl=<string>' \
  --form fast=false \
  --form 'customContext=<string>' \
  --form imageMetadataDetect=false

{
  "response": "<string>",
  "method": "fast",
  "pages_processed": 123,
  "confidence_score": 123,
  "image_metadata": [
    {
      "figure_id": 123,
      "cropped_image_url": "<string>",
      "bounding_box": {
        "original": {
          "x": 123,
          "y": 123,
          "w": 123,
          "h": 123
        },
        "pixel": {
          "x": 123,
          "y": 123,
          "w": 123,
          "h": 123
        }
      },
      "caption": "<string>",
      "metadata": {},
      "subfigure_count": 123,
      "processing_times": {
        "detection_ms": 123,
        "crop_ms": 123,
        "ocr_ms": 123,
        "total_ms": 123
      }
    }
  ]
}
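The curl request above can be mirrored in Python with only the standard library. This is a minimal sketch, not an official client: the helper names `encode_multipart` and `build_extract_request` are hypothetical, and it assumes a local file upload rather than `fileUrl`. It builds the request without sending it.

```python
import mimetypes
import urllib.request
import uuid
from typing import Dict, Tuple

EXTRACT_URL = "https://api.trycardinal.ai/extract"


def encode_multipart(
    fields: Dict[str, str], filename: str, file_bytes: bytes
) -> Tuple[bytes, str]:
    """Encode text fields plus one `file` part as multipart/form-data.

    Returns the request body and the matching Content-Type header value.
    """
    boundary = uuid.uuid4().hex
    parts = []
    for name, value in fields.items():
        parts.append(
            (
                f"--{boundary}\r\n"
                f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
                f"{value}\r\n"
            ).encode("utf-8")
        )
    # Guess the part's media type from the file extension.
    media_type = mimetypes.guess_type(filename)[0] or "application/octet-stream"
    file_header = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'
        f"Content-Type: {media_type}\r\n\r\n"
    ).encode("utf-8")
    parts.append(file_header + file_bytes + b"\r\n")
    parts.append(f"--{boundary}--\r\n".encode("utf-8"))
    return b"".join(parts), f"multipart/form-data; boundary={boundary}"


def build_extract_request(
    api_key: str, schema: str, filename: str, file_bytes: bytes
) -> urllib.request.Request:
    """Build (but do not send) a POST request for the extract endpoint."""
    body, content_type = encode_multipart(
        {"schema": schema, "fast": "false"}, filename, file_bytes
    )
    return urllib.request.Request(
        EXTRACT_URL,
        data=body,
        headers={"x-api-key": api_key, "Content-Type": content_type},
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` would send the request; the response body can then be read and decoded as JSON.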

POST /extract

Authorizations

x-api-key

string

header

required
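A minimal sketch of supplying the key, assuming it is stored in an environment variable. The variable name `CARDINAL_API_KEY` and the helper `auth_headers` are hypothetical choices for this example and are not part of the API.

```python
import os
from typing import Dict


def auth_headers(env_var: str = "CARDINAL_API_KEY") -> Dict[str, str]:
    """Return the x-api-key header, reading the key from the environment.

    The environment variable name is an assumption made for this example.
    """
    api_key = os.environ.get(env_var)
    if not api_key:
        raise RuntimeError(f"Set {env_var} before calling the API")
    return {"x-api-key": api_key}
```

Reading the key from the environment keeps it out of source control; any other secret store would work equally well.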

Body

multipart/form-data

file

required

PDF or image to upload. Required unless fileUrl is provided. Allowed extensions: .pdf, .jpg, .jpeg, .png

schema

string

required

Schema definition describing the fields to extract.

fileUrl

string<uri>

Publicly accessible URL of the file to process (required if no file).

fast

boolean

default:false

If true, uses the fast path, which extracts directly from pages and skips the full pipeline's post-processing.

customContext

string

Optional additional context or instructions to guide the extraction process. Useful for providing domain-specific guidance or clarifications about the document.

imageMetadataDetect

boolean

default:false

If true, includes image metadata in each page of the response.
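The body rules above (schema always required, and at least one of file or fileUrl) can be checked client-side before sending. The sketch below is illustrative only: `build_extract_fields` and its `uploading_file` flag are hypothetical names, and the lowercase `true`/`false` encoding of booleans is taken from the curl example rather than from a stated specification.

```python
from typing import Dict, Optional


def build_extract_fields(
    schema: str,
    file_url: Optional[str] = None,
    uploading_file: bool = False,
    fast: bool = False,
    custom_context: Optional[str] = None,
    image_metadata_detect: bool = False,
) -> Dict[str, str]:
    """Return the text form fields for POST /extract.

    The file part itself is not built here; `uploading_file` only records
    whether the caller will attach one to the multipart body.
    """
    if not schema:
        raise ValueError("schema is required")
    if not uploading_file and not file_url:
        raise ValueError("provide either file or fileUrl")

    fields = {
        "schema": schema,
        # Booleans are sent as lowercase strings, as in the curl example.
        "fast": "true" if fast else "false",
        "imageMetadataDetect": "true" if image_metadata_detect else "false",
    }
    # Optional fields are only included when the caller supplied them.
    if file_url:
        fields["fileUrl"] = file_url
    if custom_context:
        fields["customContext"] = custom_context
    return fields
```

For example, `build_extract_fields("<schema>", file_url="https://example.com/invoice.pdf", fast=True)` yields the fields for a URL-based fast extraction, while omitting both the file and the URL raises `ValueError` before any network call is made.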

Response

Successful schema extraction

response

string

required

Model's structured output matching the provided schema.

method

enum<string>

Available options: fast, slow

pages_processed

integer

Number of pages processed. Present in slow mode.

confidence_score

number

Confidence score (0-100) indicating extraction reliability, present in slow mode.

image_metadata

object[]

Image metadata entries for this extraction (present if imageMetadataDetect=true).
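Because only `response` is marked required, client code should treat the other fields as optional. The following is a minimal sketch of parsing the response body under that assumption; the `ExtractResult` class, the `needs_review` helper, and the threshold of 80 are hypothetical and chosen purely for illustration.

```python
import json
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class ExtractResult:
    response: str
    method: Optional[str] = None
    pages_processed: Optional[int] = None
    confidence_score: Optional[float] = None
    image_metadata: List[Dict[str, Any]] = field(default_factory=list)


def parse_extract_result(raw: str) -> ExtractResult:
    """Parse the JSON body of a successful extraction response."""
    data = json.loads(raw)
    return ExtractResult(
        response=data["response"],  # the only required field
        method=data.get("method"),
        pages_processed=data.get("pages_processed"),
        confidence_score=data.get("confidence_score"),
        image_metadata=data.get("image_metadata") or [],
    )


def needs_review(result: ExtractResult, threshold: float = 80.0) -> bool:
    """Flag results with no score or with a score below the threshold.

    The threshold is an arbitrary example value on the 0-100 scale.
    """
    return result.confidence_score is None or result.confidence_score < threshold
```

With this sketch, a result that carries no `confidence_score` is flagged for review, since there is no score to compare against the threshold.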
