Understanding Confidence Metrics

Every extraction includes detailed confidence metrics to help you assess the quality of the results.

Confidence Score Breakdown

{
  "confidence": {
    "image_analysis": 85,
    "data_extraction": 90,
    "validation": 88,
    "overall": 88
  }
}

Metric Definitions

image_analysis (0-100): Quality of the source image or text, based on OCR confidence or PDF text extraction quality.

data_extraction (0-100): Accuracy of the data extracted from the text, based on LLM extraction confidence.

validation (0-100): Schema validation success rate. Each missing required field or type mismatch reduces it by 5%, up to a maximum penalty of 30%.

overall (0-100): Combined confidence score, the average of the three metrics above. In the breakdown shown earlier, (85 + 90 + 88) / 3 ≈ 88.
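As a rough illustration of how the overall score relates to the other three, the sketch below averages the component metrics and rounds to a whole number. The function name `overall_confidence` is hypothetical, and the rounding rule is an assumption inferred from the sample values on this page, not a documented guarantee.

```python
def overall_confidence(image_analysis: int, data_extraction: int, validation: int) -> int:
    """Average the three component scores and round to a whole number.

    Assumption: the service rounds to the nearest integer. This matches the
    sample responses on this page but is not stated explicitly.
    """
    scores = (image_analysis, data_extraction, validation)
    for score in scores:
        if not 0 <= score <= 100:
            raise ValueError(f"score out of range 0-100: {score}")
    return round(sum(scores) / len(scores))


# Values taken from the sample breakdown: (85 + 90 + 88) / 3 = 87.67
print(overall_confidence(85, 90, 88))
```

Recomputing the average on the client side can serve as a sanity check that a response was parsed correctly.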

Confidence Score Ranges

Score Range	Quality	Recommendation
90-100	Excellent	Data is highly reliable, use with confidence
75-89	Good	Data is generally reliable, spot-check important fields
60-74	Fair	Review extracted data carefully
0-59	Poor	Manual review required, consider re-processing
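
The table above can be expressed as a small lookup, sketched here for client code that wants a label instead of a raw number. The names `RANGES` and `classify_confidence` are hypothetical. The table lists whole-number ranges only, so treating a fractional score such as 89.5 as "Good" is an assumption of this sketch.

```python
# (lower bound, quality, recommendation), ordered from highest to lowest.
RANGES = [
    (90, "Excellent", "Data is highly reliable, use with confidence"),
    (75, "Good", "Data is generally reliable, spot-check important fields"),
    (60, "Fair", "Review extracted data carefully"),
    (0, "Poor", "Manual review required, consider re-processing"),
]


def classify_confidence(score: float):
    """Return (quality, recommendation) for a 0-100 confidence score."""
    if not 0 <= score <= 100:
        raise ValueError(f"score out of range 0-100: {score}")
    for lower_bound, quality, recommendation in RANGES:
        if score >= lower_bound:
            return quality, recommendation


quality, recommendation = classify_confidence(88)
print(f"{quality}: {recommendation}")
```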

Automatic Quality Improvements

The system automatically improves extraction quality:

Low Confidence Retry

If the overall confidence is below 60, the system automatically retries with enhanced settings:

Increased OCR preprocessing
More detailed extraction prompts
Stricter validation

Validation Penalties

Missing required fields or type mismatches reduce validation confidence:

Missing required field: -5% per field
Type mismatch: -5% per field
Maximum penalty: -30%
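
The penalty rules above amount to a capped sum, sketched below. The helper names are hypothetical. The sketch assumes the 5% figures are points subtracted from a base validation score. How the service derives that base score is not documented here, so it is passed in as a parameter.

```python
PENALTY_PER_ISSUE = 5   # per missing required field or type mismatch
MAX_PENALTY = 30        # cap on the total penalty


def validation_penalty(missing_required: int, type_mismatches: int) -> int:
    """Total penalty for the given issue counts, capped at MAX_PENALTY."""
    if missing_required < 0 or type_mismatches < 0:
        raise ValueError("issue counts must be non-negative")
    issues = missing_required + type_mismatches
    return min(PENALTY_PER_ISSUE * issues, MAX_PENALTY)


def apply_validation_penalty(base_score: int, missing_required: int, type_mismatches: int) -> int:
    """Subtract the capped penalty from a base score (assumed to be 0-100)."""
    return max(0, base_score - validation_penalty(missing_required, type_mismatches))


# Two missing fields and one type mismatch: 3 issues * 5 = 15
print(validation_penalty(2, 1))
```

Because of the cap, a document with seven issues is penalized the same as one with six.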

OCR Preprocessing

Images are automatically preprocessed to improve OCR accuracy:

Grayscale conversion
Contrast enhancement (2x)
Sharpening
Denoising
Brightness adjustment (1.2x)

Confidence Blending

The system combines confidence from multiple sources to produce an accurate overall score:

OCR confidence (if applicable)
LLM extraction confidence
Schema validation results

Real-World Examples

High Confidence (95%)

Clean typed PDF invoice with clear text.

{
  "confidence": {
    "image_analysis": 98,
    "data_extraction": 95,
    "validation": 92,
    "overall": 95
  }
}

Characteristics:

Typed PDF with selectable text
Clear formatting and structure
All required fields present
No type mismatches

Medium Confidence (75%)

Handwritten timesheet with some unclear text.

{
  "confidence": {
    "image_analysis": 74,
    "data_extraction": 78,
    "validation": 73,
    "overall": 75
  }
}

Characteristics:

Handwritten text (harder to read)
Some unclear characters
Most fields extracted successfully
Minor validation issues

Low Confidence (55%)

Faded scan with poor image quality.

{
  "confidence": {
    "image_analysis": 52,
    "data_extraction": 60,
    "validation": 53,
    "overall": 55
  }
}

Characteristics:

Poor image quality (faded, low contrast)
OCR struggled with text recognition
Missing some required fields
Multiple validation errors

Action: Because the overall score is below 60, the system automatically retries with enhanced settings.

Using Confidence Scores

Flag for Review

Flag documents with low confidence for manual review.

result = extract_document(file, schema)

if result['confidence']['overall'] < 75:
    flag_for_review(result['job_id'], result['confidence'])

Conditional Processing

Apply different processing based on confidence.

result = extract_document(file, schema)
confidence = result['confidence']['overall']

if confidence >= 90:
    # Auto-approve
    approve_document(result['extracted_data'])
elif confidence >= 75:
    # Spot-check critical fields
    if validate_critical_fields(result['extracted_data']):
        approve_document(result['extracted_data'])
    else:
        flag_for_review(result['job_id'])
else:
    # Manual review required
    flag_for_review(result['job_id'])

Track Quality Metrics

Monitor confidence scores over time to identify trends.

# Track average confidence by document type
invoice_avg_confidence = 92  # Excellent
timesheet_avg_confidence = 78  # Good, but could improve
receipt_avg_confidence = 85  # Good

# Identify problem areas
if timesheet_avg_confidence < 80:
    # Consider improving prompt or image quality
    improve_timesheet_processing()
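
The averages above are hard-coded for illustration. One way to compute them from stored results is sketched below. The `document_type` key is a hypothetical field that your own application would attach to each result, since it does not appear in the response format on this page. The function names are also illustrative.

```python
from collections import defaultdict
from statistics import mean


def average_confidence_by_type(results):
    """Group results by document type and average their overall scores."""
    grouped = defaultdict(list)
    for result in results:
        # 'document_type' is assumed to be added by the caller, not the API.
        grouped[result["document_type"]].append(result["confidence"]["overall"])
    return {doc_type: mean(scores) for doc_type, scores in grouped.items()}


def types_below(averages, threshold=80):
    """Return the document types whose average falls below the threshold."""
    return sorted(doc_type for doc_type, avg in averages.items() if avg < threshold)


results = [
    {"document_type": "invoice", "confidence": {"overall": 95}},
    {"document_type": "invoice", "confidence": {"overall": 89}},
    {"document_type": "timesheet", "confidence": {"overall": 75}},
    {"document_type": "timesheet", "confidence": {"overall": 81}},
    {"document_type": "receipt", "confidence": {"overall": 85}},
]
averages = average_confidence_by_type(results)
print(types_below(averages))
```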

Alert on Low Confidence

Send alerts when confidence drops below a threshold.

result = extract_document(file, schema)

if result['confidence']['overall'] < 60:
    send_alert(
        f"Low confidence extraction: {result['job_id']} "
        f"(confidence: {result['confidence']['overall']}%)"
    )

Improving Confidence Scores

Improve Image Quality

Scan at 300 DPI or higher
Use good lighting
Ensure text is clear and legible
Avoid shadows and glare

Optimize Schema

Use correct field types
Make optional fields nullable
Avoid overly complex nested structures
Test schema with sample documents

Refine Prompts

Provide clear extraction instructions
Specify format requirements
Handle edge cases explicitly
Tell the model how to handle unclear text

Use Better Source Documents

Prefer typed PDFs over scans
Convert scanned PDFs to typed PDFs with selectable text when possible
Clean up documents before scanning
Use high-quality scanners
