Headers

POSThttps://api.aitronos.com/v1/scrape

Extract structured data from a single URL using a JSON schema.

Headers
Copy for LLM
Copy page as Markdown for LLMs
View as Markdown
Open this page as Markdown
Open in ChatGPT
Get insights from ChatGPT
Open in Claude
Get insights from Claude

Name	Type	Required	Description
`Authorization`	string	Yes	Bearer token authentication
`Content-Type`	string	Yes	Must be `application/json`

Request Body

url string required

Target URL to scrape.

schema object required

JSON schema defining the data structure to extract.

organization_id string optional

Organization ID for billing and access control.

options object optional

Scraping configuration options.

Field	Type	Required	Description
`url`	string	Yes	Target URL to scrape
`schema`	object	Yes	JSON schema defining data structure
`options`	object	No	Scraping configuration options

Schema Object

Field	Type	Required	Description
`type`	string	Yes	Must be "object"
`properties`	object	Yes	Field definitions with types

Supported Field Types: string, number, boolean, array, object, url, date, email

Options Object

Field	Type	Default	Description
`max_items`	integer	100	Maximum items to extract (1-1000)
`timeout`	integer	30	Processing timeout in seconds (1-300)
`sync`	boolean	false	Wait for results (true) or return job ID (false)
`wait_for_content`	boolean	true	Wait for dynamic content to load
`extract_images`	boolean	false	Extract image URLs from content
`follow_pagination`	boolean	false	Follow pagination links
`llm_mode`	string	"structured"	LLM processing mode: "structured" or "json"
`date_filter`	object	null	Filter items by date
`engine_type`	string	null	Force engine: "static", "browser", or "hybrid"

Date Filter Object

Field	Type	Required	Description
`field`	string	Yes	Date field name in schema
`after`	string	No	Include items after this date (YYYY-MM-DD)
`before`	string	No	Include items before this date (YYYY-MM-DD)

Response (Synchronous Mode)

Status: 200 OK

Field	Type	Description
`job_id`	string	Unique job identifier
`status`	string	Job status: "completed"
`url`	string	The scraped URL
`extracted_data`	array	Array of extracted items
`metadata`	object	Processing metadata
`created_at`	string	Job creation timestamp (ISO 8601)
`completed_at`	string	Job completion timestamp (ISO 8601)

Metadata Object

Field	Type	Description
`processing_time`	number	Processing time in seconds
`engine_used`	string	Engine used: "static", "browser", or "hybrid"
`llm_calls`	integer	Number of LLM API calls made
`cache_hit`	boolean	Whether result was served from cache
`field_mappings`	object	How schema fields mapped to website fields
`site_analysis`	object	Site complexity and characteristics

Response (Asynchronous Mode)

Status: 200 OK

Field	Type	Description
`job_id`	string	Unique job identifier
`status`	string	Job status: "pending"
`url`	string	The scraped URL
`extracted_data`	null	Null until completed
`metadata`	null	Null until completed
`created_at`	string	Job creation timestamp
`completed_at`	null	Null until completed

Returns

Returns a JSON response indicating success or failure.

Bash

curl -X POST https://api.aitronos.com/v1/scrape \
  -H "X-API-Key: $FREDDY_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://example.com/products",
    "schema": {
      "type": "object",
      "properties": {
        "title": {"type": "string"},
        "price": {"type": "number"},
        "description": {"type": "string"},
        "availability": {"type": "string"}
      }
    },
    "options": {
      "max_items": 10,
      "timeout": 30,
      "sync": true
    }
  }'

Response 200 OK

{
  "job_id": "job_abc123def456",
  "status": "completed",
  "url": "https://example.com/products",
  "extracted_data": [
    {
      "title": "Product Name",
      "price": 29.99,
      "description": "Product description text",
      "availability": "In Stock"
    },
    {
      "title": "Another Product",
      "price": 49.99,
      "description": "Another description",
      "availability": "Out of Stock"
    }
  ],
  "metadata": {
    "processing_time": 2.5,
    "engine_used": "browser",
    "llm_calls": 1,
    "cache_hit": false,
    "field_mappings": {
      "title": {
        "website_field": "product_name",
        "confidence": 0.95
      },
      "price": {
        "website_field": "cost",
        "confidence": 0.92
      }
    },
    "site_analysis": {
      "complexity": "medium",
      "requires_js": true,
      "estimated_items": 12
    }
  },
  "created_at": "2024-12-16T10:30:00Z",
  "completed_at": "2024-12-16T10:30:02Z"
}

Scrape URL

HeadersCopyCopy for LLMCopy page as Markdown for LLMsView as MarkdownOpen this page as MarkdownOpen in ChatGPTGet insights from ChatGPTOpen in ClaudeGet insights from Claude