Document Data Extraction

Extract structured data from any document type (PDFs, Word documents, Excel files, images, scanned documents) by providing a file and a JSON schema.

Overview

The Document Data Extraction API runs each file through a multi-stage processing pipeline: it detects the best extraction strategy automatically, handles both typed and scanned documents, and uses AI to structure the extracted content according to your schema.
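
To make the "file plus JSON schema" input concrete, here is a sketch of a schema you might supply for an invoice. The field names are hypothetical and chosen only for illustration; this page does not specify which JSON Schema keywords the API honors, so treat the keywords used here as an assumption.

```python
import json

# Hypothetical invoice schema; every field name here is illustrative,
# not something defined by the API.
invoice_schema = {
    "type": "object",
    "properties": {
        "invoice_number": {"type": "string"},
        "issue_date": {"type": "string", "format": "date"},
        "total": {"type": "number"},
        "line_items": {
            "type": "array",
            "items": {
                "type": "object",
                "properties": {
                    "description": {"type": "string"},
                    "amount": {"type": "number"},
                },
                "required": ["description", "amount"],
            },
        },
    },
    "required": ["invoice_number", "total"],
}

# Serialize the schema so it can be sent alongside the file.
schema_json = json.dumps(invoice_schema, indent=2)
```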

Authentication

All endpoints require an API key, sent in the X-API-Key request header:

X-API-Key: YOUR_ACCESS_TOKEN

Get your API key from Freddy
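
A minimal sketch of building the request headers, using the X-API-Key header shown above. The environment variable name FREDDY_API_KEY and the helper build_headers are assumptions made for this example, not part of the API.

```python
import os


def build_headers(api_key: str) -> dict[str, str]:
    """Return the authentication header for a request."""
    if not api_key:
        raise ValueError("an API key is required")
    return {"X-API-Key": api_key}


# FREDDY_API_KEY is a hypothetical variable name; reading the key from the
# environment keeps it out of source control.
headers = build_headers(os.environ.get("FREDDY_API_KEY", "YOUR_ACCESS_TOKEN"))
```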

How It Works

Processing Pipeline

1. Document Analysis: Automatically detects document type and determines optimal extraction strategy
2. Text Extraction:
   - Fast Path: Direct text extraction for typed PDFs, Word, and Excel documents
   - OCR Path: Advanced OCR with image preprocessing for images and scanned documents
3. Data Structuring: Uses AI with JSON mode to structure extracted text according to your schema
4. Validation: Validates extracted data against your schema and adjusts confidence scores
5. Quality Assurance: Automatic retry with enhanced settings if confidence is low (below 60%)
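
The quality assurance step can be pictured with the sketch below. It is not the service's implementation: the extract callable, the "confidence" key, and the choice to keep whichever attempt scored higher are all assumptions. Only the 60% threshold comes from the description above.

```python
from typing import Callable

CONFIDENCE_THRESHOLD = 0.60  # the 60% threshold stated in the pipeline


def extract_with_retry(extract: Callable[[bool], dict]) -> dict:
    """Run extract once and retry with enhanced settings if confidence is low.

    extract is a hypothetical callable that takes an "enhanced" flag and
    returns a dict with a "confidence" value between 0 and 1.
    """
    result = extract(False)
    if result["confidence"] < CONFIDENCE_THRESHOLD:
        retry = extract(True)
        # Assumption: keep the attempt with the higher confidence.
        if retry["confidence"] > result["confidence"]:
            result = retry
    return result
```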

Automatic Strategy Selection

The system automatically chooses the best extraction method:

Typed Documents (PDF, DOCX, XLSX): Direct text extraction → Fast & cost-effective (~$0.001 per document)
Images/Scanned Documents: OCR with preprocessing → High accuracy (~$0.01-0.05 per document)
Fallback: If direct extraction fails, automatically falls back to OCR
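
As a rough client-side illustration of the selection rules above, the sketch below maps a file extension to a strategy and falls back to OCR when direct extraction yields no text. The service analyzes the document itself, so choosing by extension is a simplification, and the function name choose_strategy is hypothetical.

```python
from pathlib import Path
from typing import Optional

TYPED_EXTENSIONS = {".pdf", ".docx", ".xlsx"}
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".gif", ".bmp", ".tiff"}


def choose_strategy(filename: str, direct_text: Optional[str] = None) -> str:
    """Return "direct" or "ocr" for a file.

    direct_text is the result of a direct extraction attempt, if one was
    made; an empty result (for example, a scanned PDF) triggers the fallback.
    """
    ext = Path(filename).suffix.lower()
    if ext in IMAGE_EXTENSIONS:
        return "ocr"
    if ext in TYPED_EXTENSIONS:
        if direct_text is not None and not direct_text.strip():
            return "ocr"  # fallback: direct extraction found no text
        return "direct"
    raise ValueError(f"unsupported file type: {ext or '(none)'}")
```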

OCR Image Preprocessing

For images and scanned documents, the system applies advanced preprocessing to improve OCR accuracy:

Grayscale conversion: Removes color noise
Contrast enhancement: 2x contrast boost for better text visibility
Sharpening: Enhances text edges
Denoising: Removes background noise
Brightness adjustment: 1.2x brightness boost

This preprocessing significantly improves OCR accuracy for handwritten text, low-quality scans, and faded documents.
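
To show what three of these steps do to pixel values, here is a pure-Python sketch covering grayscale conversion, the 2x contrast boost, and the 1.2x brightness boost. It is an illustration under assumptions, not the service's code: the luma weights, the choice to scale contrast around the image mean, and the order of operations are all assumed, and sharpening and denoising are left out.

```python
def to_gray(rgb: tuple[int, int, int]) -> float:
    """Convert an RGB pixel to a gray level using ITU-R BT.601 luma weights."""
    r, g, b = rgb
    return 0.299 * r + 0.587 * g + 0.114 * b


def clamp(value: float) -> int:
    """Round to an integer and keep it inside the 0..255 range."""
    return max(0, min(255, round(value)))


def preprocess(pixels, contrast: float = 2.0, brightness: float = 1.2):
    """Apply grayscale, contrast, and brightness to rows of RGB pixels."""
    gray = [[to_gray(p) for p in row] for row in pixels]
    flat = [v for row in gray for v in row]
    mean = sum(flat) / len(flat)
    # Contrast stretches each value away from the mean; brightness scales it.
    return [
        [clamp((mean + contrast * (v - mean)) * brightness) for v in row]
        for row in gray
    ]
```

With a uniform image the contrast step has no effect, so only the brightness factor changes the values; with a mix of dark and light pixels, the two extremes are pushed toward 0 and 255.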

Supported File Types

Documents: PDF (.pdf), Word (.docx), Excel (.xlsx)
Images: JPEG (.jpg, .jpeg), PNG (.png), GIF (.gif), BMP (.bmp), TIFF (.tiff)

File Size Limit: 50 MB (lower on the Free and Basic tiers; see Rate Limits)
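
Checking a file locally before uploading avoids a wasted request. The sketch below tests the extension and the 50 MB limit listed above; the helper name check_upload is hypothetical, and treating 1 MB as 1024 * 1024 bytes is an assumption, since this page does not define the unit.

```python
from pathlib import Path

SUPPORTED_EXTENSIONS = {
    ".pdf", ".docx", ".xlsx",
    ".jpg", ".jpeg", ".png", ".gif", ".bmp", ".tiff",
}
MAX_FILE_BYTES = 50 * 1024 * 1024  # assumes 1 MB = 1024 * 1024 bytes


def check_upload(filename: str, size_bytes: int) -> list[str]:
    """Return a list of problems; an empty list means the file looks acceptable."""
    problems = []
    ext = Path(filename).suffix.lower()
    if ext not in SUPPORTED_EXTENSIONS:
        problems.append(f"unsupported file type: {ext or '(none)'}")
    if size_bytes > MAX_FILE_BYTES:
        problems.append(f"file is {size_bytes} bytes; limit is {MAX_FILE_BYTES}")
    return problems
```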

Key Features

Automatic document type detection
Intelligent extraction strategy selection
Support for typed and scanned documents
Advanced OCR with image preprocessing
Schema-based data structuring
Detailed confidence metrics
Automatic quality improvement (retry on low confidence)
Batch processing (up to 50 documents)
Custom prompt support
Cost optimization
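
Because a batch holds up to 50 documents, a larger set of files has to be split before submission. A minimal sketch of that split follows; the function name chunk_documents is hypothetical.

```python
MAX_BATCH_SIZE = 50  # batch limit listed under Key Features


def chunk_documents(paths: list[str], size: int = MAX_BATCH_SIZE) -> list[list[str]]:
    """Split a list of document paths into batches of at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    return [paths[i:i + size] for i in range(0, len(paths), size)]
```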

Use Cases

Timesheet Processing - Extract work hours, shifts, and employee data from timesheets (See guide)
Invoice and receipt processing
Form data extraction
Document digitization
Data entry automation
Contract analysis
ID card and passport scanning
Medical record processing
Financial document parsing

Rate Limits

Tier	Requests/Day	Concurrent Jobs	Max File Size
Free	100	1	10 MB
Basic	1,000	3	25 MB
Pro	10,000	10	50 MB
Enterprise	Custom	Custom	Custom
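
One way to stay within the concurrent-job limit is to cap the number of worker threads on the client side. The sketch below takes the limits from the table above; the tier keys and the run_jobs helper are assumptions, and each job is a hypothetical zero-argument callable standing in for an API call.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable

# Concurrent job limits from the table above (Enterprise is custom).
TIER_CONCURRENCY = {"free": 1, "basic": 3, "pro": 10}


def run_jobs(tier: str, jobs: list[Callable[[], object]]) -> list[object]:
    """Run jobs with at most the tier's concurrent-job limit in flight."""
    limit = TIER_CONCURRENCY[tier]
    with ThreadPoolExecutor(max_workers=limit) as pool:
        # map() returns results in the same order as the submitted jobs.
        return list(pool.map(lambda job: job(), jobs))
```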

Pricing

Costs are calculated based on the processing method used:

Method	Cost Range	When Used
Direct Text Extraction	~$0.001 per document	Typed PDFs, Word, Excel
OCR + LLM Extraction	~$0.01-0.05 per document	Images, scanned documents, handwritten text
Retry (Low Confidence)	+$0.001-0.005 per document	Automatic retry if confidence < 60%
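
The table translates into a simple range estimate. The sketch below uses the approximate per-document figures above; the function name estimate_cost and the method keys are hypothetical, and actual billing may differ.

```python
# Approximate (low, high) cost per document in USD, from the table above.
METHOD_COST = {"direct": (0.001, 0.001), "ocr": (0.01, 0.05)}
RETRY_COST = (0.001, 0.005)


def estimate_cost(method: str, documents: int, retries: int = 0) -> tuple[float, float]:
    """Return an approximate (low, high) cost for a number of documents."""
    low, high = METHOD_COST[method]
    return (
        documents * low + retries * RETRY_COST[0],
        documents * high + retries * RETRY_COST[1],
    )
```

For example, estimate_cost("ocr", 100, retries=10) gives a range of roughly $1.01 to $5.05.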

Cost Optimization Tips

Use Typed Documents: PDFs with selectable text are 10-50x cheaper than scanned images
Choose Appropriate Model: Use a smaller, cheaper model for most cases, and gpt-4o only when needed
Batch Processing: Process multiple documents in one batch for better efficiency
Optimize Images: Reduce image resolution to 300 DPI (sufficient for OCR)
Pre-process Documents: Convert scanned PDFs to typed PDFs when possible

Next Steps

Extract Document Data - Extract structured data from a single document
Batch Document Extraction - Process multiple documents in parallel
Get Job Status - Check the status of an extraction job
Analyze Image - Analyze images using Vision API
