
Creates a model response for AI-powered text generation, structured outputs, and tool-augmented workflows.

POST https://api.aitronos.com/v1/model/response

Provide text or image inputs to generate text or JSON outputs. Have the model call your own custom code or use built-in tools like web search or file search.

Request Body

Core Parameters

organization_id string required

The unique identifier of the organization to which this request belongs. All API requests must be scoped to an organization for billing, access control, and resource management. Find your organization ID in Freddy Hub → Settings → Organization.

assistant_id string optional

ID of the assistant to use for this response. When provided, automatically applies the assistant's configured rules, instructions, and settings. Enables consistent AI behavior across conversations with priority-based rule application and context management. Learn more about assistants

rules array of strings optional

IDs of rules to attach to this response. Rules provide additional behavior modifications and can override default model behavior. Each string should be a valid rule ID (e.g., rule_abc123). Learn more about rules

disable_rules boolean optional · Defaults to false

Disable all rule application for this response. When true, no rules will be applied regardless of other rule settings. This parameter is only available when using API keys and cannot be used with bearer tokens (user authentication). Learn more about rules

model string optional · Defaults to ftg-3.0

The AI model to use for generating the response. Aitronos supports various models with different capabilities, performance characteristics, and pricing. Choose based on your task requirements: reasoning models for complex problems, fast models for simple queries, or vision models for image understanding. View available models →

inputs array required

Array of input message objects. Each message contains a role and content arrays (texts, images, audio, files). View full object →


Properties

role string required

The role of the message input. One of user, system, or assistant.

texts array optional

Array of text content items.

Show structure

Each text item:

text string required

The text content.

Example: [{ "text": "Hello, how are you?" }]

images array optional

Array of image inputs.

Show structure

Each image item can have:

file_id string optional

Reference to an uploaded file.

url string optional

Direct image URL.

Example: [{ "file_id": "file_abc123" }] or [{ "url": "https://example.com/image.jpg" }]

audio array optional · in development

Array of audio inputs.

Show structure

Each audio item:

file_id string required

Reference to uploaded audio file.

Example: [{ "file_id": "file_audio123" }]

files array optional

Array of file attachments for context (PDFs, documents, etc.).

Show structure

Each file item:

file_id string required

Reference to uploaded file.

Example: [{ "file_id": "file_doc123" }]

id string optional

The unique ID of the input message. Populated when items are returned via API.

Thread and Context

thread string or object optional · Defaults to null

The thread that this response belongs to. Items from this thread are prepended to inputs for this response request. Input items and output items from this response are automatically added to this thread after completion. Learn more

If no thread is provided (or set to null), a new thread is automatically created for this request. The response will include the new thread ID in the thread field, which can be used in subsequent requests to maintain conversation context and history. This enables seamless multi-turn conversations without manual thread management. For stateless multi-turn conversations, use previous_response_id instead (see inputs documentation).
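
For example, a follow-up request can continue an existing conversation by passing the thread ID returned by the first response (IDs below are placeholders):

```json
{
  "organization_id": "org_abc123",
  "thread": "thread_xyz789",
  "inputs": [
    { "role": "user", "texts": [{ "text": "And what about Germany?" }] }
  ]
}
```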

store boolean optional · Defaults to true

Controls whether the created thread is visible in the user's thread list. When true (default), the thread appears in the list threads API and is visible in the user interface. When false, the thread is hidden from the list threads API but remains accessible by direct thread ID lookup. This is useful for internal operations, background tasks, or temporary threads that shouldn't clutter the user's conversation history. Learn more

instructions string optional

System-level instructions that define the model's behavior and capabilities. These instructions override any previous system messages when continuing from a prior response, allowing you to dynamically adjust the model's context without carrying over old system prompts.

rule_context object optional

Contextual information provided to attached rules to influence their behavior and execution. Because rules are explicitly attached by ID, this context helps them understand the request characteristics and adapt their application accordingly.

Show properties

request_type string optional

Type of request being made. Examples: chat, technical_documentation, creative_writing, code_generation, data_analysis.

user_expertise string optional

User's expertise level. Values: beginner, intermediate, expert, professional. Provides context to attached rules about the expected user knowledge level, allowing rules to adjust their behavior accordingly.

output_format string optional

Desired output format. Examples: markdown, html, json, plain_text, code.

target_audience string optional

Intended audience for the response. Examples: developers, business_users, students, general_public.
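
A sample rule_context combining these fields (values are illustrative):

```json
{
  "rule_context": {
    "request_type": "technical_documentation",
    "user_expertise": "intermediate",
    "output_format": "markdown",
    "target_audience": "developers"
  }
}
```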

Response Delivery

stream boolean optional · Defaults to false

Enable streaming mode to receive response data incrementally as it's generated, using server-sent events (SSE). When enabled, the model sends partial outputs in real-time rather than waiting for the complete response. Ideal for user-facing applications requiring immediate feedback. Learn more

output_mode string optional · Defaults to text

Controls the format of the model's response output. Determines how the response content is structured and returned.

Show available modes
  • text - Rich text with markdown formatting (default). Natural language responses with full markdown support.
  • plain - Plain text with all markdown formatting stripped. Useful when you need raw text without any formatting.
  • blocks - Structured blocks for custom UI rendering. Returns typed content blocks that can be rendered with custom components.
  • json - Valid JSON output. The model returns a JSON object or array. Useful for structured data extraction and API integrations.
  • json_schema - JSON output validated against a provided schema. Ensures the response matches your exact data structure requirements. Requires the json_schema parameter.
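
For example, a request asking for raw JSON output (prompt text is illustrative):

```json
{
  "output_mode": "json",
  "inputs": [
    { "role": "user", "texts": [{ "text": "List the three largest countries by area as JSON." }] }
  ]
}
```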

json_schema object optional

JSON schema definition for structured output validation. Required when output_mode is set to json_schema. Defines the exact structure the model must follow in its response.

Show schema structure

type string required

Root type of the schema. Typically object or array.

properties object optional

Object properties definition when type is object. Each property defines its own type and constraints.

required array optional

Array of required property names.

Example:

{
  "type": "object",
  "properties": {
    "name": { "type": "string" },
    "age": { "type": "number" },
    "email": { "type": "string" }
  },
  "required": ["name", "email"]
}

include array optional

Specify additional data to include in the response. Each value expands specific parts of the output with extra information. Learn more

Show possible types
  • all - Include all available additional data types (web search sources, code outputs, logs, etc.). Use this for comprehensive debugging and full visibility into the response generation process. Note: Significantly increases response size and processing time.
  • web_search.sources - Include source URLs and metadata from web search results
  • code_interpreter.outputs - Include Python execution outputs and generated files
  • computer_use.screenshots - Include screenshot URLs from computer interactions
  • file_search.results - Include matched document chunks and relevance scores
  • function_calls.logs - Include execution logs and timing data for function calls
  • function_calls.sources - Include source code context for function executions
  • message.input_images - Include full image URLs from user messages
  • message.output_images - Include generated image URLs from assistant responses
  • message.logprobs - Include output-level probability scores for generated content
  • reasoning.encrypted - Include encrypted reasoning data for stateless conversations
  • request.logs - Include detailed request processing logs
  • tools.available - Include list of tools that were available to the model during response generation
  • tools.used - Include list of tools that were actually used by the model during response generation
  • usage.detailed - Include synapse and neuron usage breakdown by component
  • rules - Include list of rules that were passed to this response, showing rule IDs, names, and basic metadata
  • rules.debug - Include detailed metadata about rule application, including which rules were considered, applied, compressed, or filtered
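
For example, to expand web search sources and the usage breakdown without the overhead of all:

```json
{
  "include": ["web_search.sources", "usage.detailed"]
}
```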

Limits and Controls

max_output_synapses integer optional

Maximum number of synapses the model can generate in the response, including both visible output and internal reasoning synapses. Use this to control response length and computational cost. Learn more

metadata object optional

Custom key-value pairs (up to 16) for attaching structured information to the response. Useful for tracking, categorization, or filtering responses in your application. Keys must be ≤64 characters, values ≤512 characters.

previous_response_id string optional

Reference a previous response to create multi-turn conversations while maintaining context. The model will use the outputs and state from the referenced response as the starting point for this new response. Cannot be used together with thread. Learn more
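
A stateless multi-turn follow-up referencing a prior response might look like this (IDs are placeholders):

```json
{
  "previous_response_id": "resp_abc123",
  "inputs": [
    { "role": "user", "texts": [{ "text": "Can you elaborate on that?" }] }
  ]
}
```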

thread_context_mode string optional · Defaults to recent

Controls how thread history is managed when exceeding neuron capacity limits. Available modes: recent (keeps newest messages), smart (preserves start + recent), full (keeps all messages). Learn more

Prompt and Reasoning

prompt object optional

Reference to a reusable prompt template with variable substitution. Prompt templates allow you to define standardized instructions once and reuse them across requests with dynamic values. Learn more

Show properties

id string required

The unique identifier of the prompt template to use.

variables object optional

Key-value pairs to substitute into template variables. Values can be strings, numbers, or other input types like images or files.

version string optional

Specific version of the prompt template. If omitted, uses the latest version.
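
A prompt reference with variable substitution might look like this (template ID, variable names, and values are placeholders):

```json
{
  "prompt": {
    "id": "prompt_abc123",
    "variables": { "customer_name": "Ada", "tone": "formal" },
    "version": "2"
  }
}
```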

reasoning object optional

Unified reasoning configuration that works across all reasoning-capable models (OpenAI GPT-5/O-series, Anthropic Claude). Controls how much computational effort the model dedicates to internal chain-of-thought processing before generating the final output. The API automatically maps your effort level to provider-specific parameters. Learn more

Show properties

effort string optional · Defaults to medium

Controls the computational effort spent on reasoning/thinking. Higher effort produces more thorough analysis but increases response time and token usage.

Available values: off, low, medium, high, maximum

View provider mapping details →

summary string optional

Request a summary of the model's reasoning process.

Available values:

  • auto (model decides)
  • concise (brief overview)
  • detailed (comprehensive explanation)

Note: Some models do not stream reasoning content in real-time, only metadata events (reasoning.started, reasoning.completed). Other models stream full thinking content as it's generated.
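
Putting both fields together, a request for thorough reasoning with a brief summary (illustrative):

```json
{
  "reasoning": {
    "effort": "high",
    "summary": "concise"
  }
}
```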

Tools

tools array optional

Array of tools the model can invoke during response generation. Enables extended capabilities beyond text generation, including web search, file analysis, code execution, custom function calls, and personal connectors. Tool usage can be controlled via the tool_choice parameter. Learn more

Show possible types
System Tools

Built-in tools provided by Aitronos for common tasks like file search, web search, code execution, image generation, and computer control.

File Search object

Built-in tool for searching uploaded files and documents using vector similarity search. Retrieves relevant content from document collections to augment model responses with your data. Learn more

Show properties

type string required

The type of tool. Always set to file_search.

vector_store_ids array required

Array of vector store IDs to search. Vector stores contain your uploaded and indexed documents.

filters object optional

Query filters to narrow search results (e.g., by file type, date, metadata).

max_num_results integer optional

Maximum number of results to return per search. Must be between 1 and 50. Higher values provide more context but increase neuron usage.

ranking_options object optional

Advanced ranking configuration for search results.

Show ranking properties

ranker string optional

Ranking algorithm to use. Options: default, semantic, hybrid.

score_threshold number optional

Minimum relevance score (0.0 to 1.0) for results. Higher values return only highly relevant matches but may return fewer results.
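
A file_search tool entry using these options might look like this (vector store ID is a placeholder):

```json
{
  "tools": [
    {
      "type": "file_search",
      "vector_store_ids": ["vs_abc123"],
      "max_num_results": 10,
      "ranking_options": { "ranker": "semantic", "score_threshold": 0.7 }
    }
  ]
}
```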

Web Search object

Built-in tool for real-time internet search. Retrieves current information, news, and data beyond the model's training cutoff. Learn more

Show properties

type string required

The type of tool. One of web_search or web_search_2025_08_26 (version-specific).

search_context_size string optional · Defaults to medium

Controls how much neuron capacity the search results can consume. Higher values provide more comprehensive search context but increase neuron usage. Lower values are faster and more cost-effective but may miss relevant details. Available values: low, medium, high.

filters object optional

Search filters to control which domains are queried.

Show filter properties

allowed_domains array optional · Defaults to []

Restrict search to specific domains. If empty, all domains are allowed. Subdomains are automatically included (e.g., example.com includes blog.example.com).

Example: ["pubmed.ncbi.nlm.nih.gov", "nih.gov"]

user_location object optional

Approximate geographic location of the user to provide location-aware search results. Useful for local queries (e.g., "restaurants nearby", "weather today").

Show location properties

type string optional · Defaults to approximate

Type of location approximation. Always set to approximate for privacy.

city string optional

City name in free text format. Example: San Francisco, Tokyo, London.

region string optional

State, province, or region name. Example: California, Ontario, Bavaria.

country string optional

Two-letter ISO 3166-1 alpha-2 country code. Example: US, JP, GB, DE.

timezone string optional

IANA timezone identifier. Example: America/Los_Angeles, Europe/London, Asia/Tokyo.
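
Combining these options, a web_search tool restricted to medical domains with a location hint (values are illustrative):

```json
{
  "tools": [
    {
      "type": "web_search",
      "search_context_size": "medium",
      "filters": { "allowed_domains": ["pubmed.ncbi.nlm.nih.gov", "nih.gov"] },
      "user_location": {
        "type": "approximate",
        "city": "San Francisco",
        "country": "US",
        "timezone": "America/Los_Angeles"
      }
    }
  ]
}
```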

Code Interpreter object · in development

Executes Python code in a secure sandbox environment to perform computations, data analysis, file processing, and visualization tasks. Ideal for mathematical calculations, data transformations, and generating charts. Learn more

Show properties

type string required

The type of tool. Always set to code_interpreter.

container string or object required

Specifies the code execution environment. Can be a container ID (string) for a pre-configured environment, or an object to auto-configure with specific files.

Show container types

String (Container ID):

Reference a pre-existing container by ID. Example: "container_abc123".

Object (Auto-configuration):

Automatically provision a container with specified files. Object properties:

type string required

Container type. Always set to auto for automatic provisioning.

file_ids array optional

Array of file IDs to make available in the code interpreter environment. Files must be uploaded via the Files API first. Example: ["file_abc123", "file_xyz789"].
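
An auto-provisioned code_interpreter container with two input files (file IDs are placeholders):

```json
{
  "tools": [
    {
      "type": "code_interpreter",
      "container": { "type": "auto", "file_ids": ["file_abc123", "file_xyz789"] }
    }
  ]
}
```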

Computer Use Preview object · in development

Experimental tool enabling the model to control a virtual computer environment. Allows interaction with applications, browsers, and system interfaces. Learn more

Show properties

type string required

The type of tool. Always set to computer_use_preview.

environment string required

Type of virtual environment to control. Options: desktop, browser, terminal.

display_width integer required

Virtual display width in pixels. Recommended: 1280-1920.

display_height integer required

Virtual display height in pixels. Recommended: 720-1080.

Personal Connectors

Access to external services and APIs through configured personal connectors. Enables the model to interact with services like Gmail, Google Calendar, Dropbox, and more. Learn more about personal connectors →

Personal Connectors object

Show properties

type string required

The type of tool. Always set to personalConnector.

configuration_ids array of strings optional

Specific personal connector configuration IDs to use. If not provided, all enabled configurations for the authenticated user or API key will be available. Each string should be a valid configuration ID (e.g., pconf_abc123).
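
For example, limiting the model to one specific connector configuration (ID is a placeholder):

```json
{
  "tools": [
    { "type": "personalConnector", "configuration_ids": ["pconf_abc123"] }
  ]
}
```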

Streamline Tools

Built-in tool for executing user-uploaded scripts on the Streamline platform. Allows running custom server-side scripts as part of the AI workflow. Scripts are uploaded separately to the Streamline platform and referenced by ID.

Streamline Tool object

Show properties

type string required

The type of tool. Always set to streamline.

script_id string required

The unique identifier of the uploaded script on the Streamline platform.

parameters object optional

Key-value pairs of parameters to pass to the script during execution. These are made available to the script as environment variables or input arguments.

environment string optional · Defaults to python

The runtime environment for the script. Available options: python, node, bash, ruby, php.

timeout integer optional · Defaults to 300

Maximum execution time in seconds before the script is terminated. Range: 1-64800 (18 hours). When timeout exceeds 5 minutes (300 seconds), background mode is automatically enabled to prevent request timeouts.

background boolean optional · Defaults to false

Run the script execution asynchronously in the background. When enabled, the API returns immediately with an execution ID while the script continues processing. Automatically enabled when timeout exceeds 5 minutes (300 seconds).

push_notification_on_completion boolean optional

Send a push notification when script execution completes. Useful for long-running scripts to notify users of completion status. Only applicable when background is true or when timeout exceeds 5 minutes.

require_approval boolean optional · Defaults to false

If true, requires user approval before executing the script. Useful for scripts with side effects.
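
A streamline tool entry for a longer-running script (script ID and parameters are placeholders; background mode would be enabled automatically here because the timeout exceeds 300 seconds):

```json
{
  "tools": [
    {
      "type": "streamline",
      "script_id": "script_abc123",
      "environment": "python",
      "parameters": { "report_date": "2024-01-31" },
      "timeout": 900,
      "push_notification_on_completion": true
    }
  ]
}
```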

Custom MCP

Connect to Model Context Protocol (MCP) servers to access additional tools and capabilities. MCP enables the model to interact with external services, databases, APIs, and custom integrations through standardized protocols. Learn more

MCP Tool object

Show properties

type string required

The type of tool. Always set to mcp.

server_label string required

A unique label identifying this MCP server connection. Used to distinguish between multiple MCP servers in a single request and appears in tool calls and logs. Example: "database_server", "api_integration".

Server Connection

Choose one of the following methods to connect to an MCP server:

configuration_id string optional

ID of a saved MCP configuration. Use this to reference pre-configured connectors that have been set up in advance. See MCP Configurations API

server_url string optional

Direct URL to a custom MCP server endpoint. Use this for connecting to your own MCP-compatible servers or third-party MCP services. Example: "https://mcp.example.com/v1".

connector_id string optional

Identifier for built-in service connectors provided by Aitronos. These connectors handle authentication and connection setup automatically.

Show available connectors

Currently supported connector IDs:

  • connector_dropbox - Dropbox file storage
  • connector_gmail - Gmail email access
  • connector_googlecalendar - Google Calendar
  • connector_googledrive - Google Drive
  • connector_microsoftteams - Microsoft Teams
  • connector_outlookcalendar - Outlook Calendar
  • connector_outlookemail - Outlook Email
  • connector_sharepoint - SharePoint

Note: One of server_url, connector_id, or configuration_id must be provided.

Authentication

authorization string optional

OAuth access token or API key for authenticating with the remote MCP server. Required when using custom MCP servers via server_url or when using service connectors that require additional authentication. Your application handles the OAuth flow and provides the token here.

headers object optional

Custom HTTP headers to send with requests to the MCP server. Used for authentication, API versioning, or other server-specific requirements. Example: {"X-API-Key": "your-key", "Authorization": "Bearer token"}.

Server Configuration

server_description string optional

Optional description of the MCP server's capabilities and purpose. This helps the model understand when and how to use tools from this server. Example: "Internal analytics database with user query tools".

Tool Filtering

allowed_tools array or object optional

Restrict which tools from the MCP server the model can invoke. By default, all available tools are accessible. Use this to limit access for security or to focus the model on specific capabilities.

Show examples

Array format - List specific tool names:

["search_documents", "list_files", "get_file_content"]

Filter object - Use patterns to include/exclude tools:

{
  "include": ["search_*", "get_*"],
  "exclude": ["delete_*", "update_*"]
}

Approval Settings

require_approval string or object optional · Defaults to always

Specify which MCP server tools require user approval before execution. This provides an additional security layer for tools that perform sensitive operations.

Show approval options

String values:

  • always - All tools require approval before execution (default, recommended)
  • never - No approval required (use with caution, only for trusted servers)

Object format - Fine-grained control with patterns:

{
  "require": ["delete_*", "update_*", "send_*"],
  "allow": ["search_*", "get_*", "list_*"]
}

This configuration requires approval for tools matching delete_*, update_*, or send_* patterns, while allowing search_*, get_*, and list_* tools to execute without approval.
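
A complete MCP tool entry combining these settings (server URL and token are placeholders):

```json
{
  "tools": [
    {
      "type": "mcp",
      "server_label": "database_server",
      "server_url": "https://mcp.example.com/v1",
      "authorization": "YOUR_OAUTH_TOKEN",
      "allowed_tools": { "include": ["search_*", "get_*"], "exclude": ["delete_*"] },
      "require_approval": "always"
    }
  ]
}
```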

Custom Function Calls

Custom functions defined by you that the model can call with strongly-typed arguments. Enables the model to interact with your application code, APIs, or external services. Learn more

Function object

Show properties

type string required

The type of tool. Always set to function.

name string required

Unique identifier for the function. Used by the model to reference and invoke the function. Must be alphanumeric with underscores (e.g., get_weather, calculate_total).

description string optional

Human-readable explanation of what the function does. The model uses this to determine when and how to call the function. Be specific and clear.

parameters object required

JSON Schema object defining the function's input parameters. Specifies parameter names, types, descriptions, and whether they're required.

strict boolean optional · Defaults to true

Enforce strict parameter validation. When true, the model guarantees parameters match the schema exactly. When false, allows best-effort parameter generation.

require_approval boolean optional · Defaults to false

If true, requires user approval before executing the function. Useful for functions that perform sensitive operations, modify data, or have side effects. When false, the function executes automatically when called by the model.
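
A minimal function definition the model could call (the get_weather function and its parameters are hypothetical):

```json
{
  "tools": [
    {
      "type": "function",
      "name": "get_weather",
      "description": "Get the current weather for a city.",
      "parameters": {
        "type": "object",
        "properties": {
          "city": { "type": "string", "description": "City name, e.g. Paris" },
          "unit": { "type": "string", "enum": ["celsius", "fahrenheit"] }
        },
        "required": ["city"]
      },
      "strict": true
    }
  ]
}
```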

System Tools

system_tools object optional

Configuration for built-in system tools provided by Aitronos. These extend model capabilities beyond the custom tools in the tools array. System tools are enabled/disabled via mode settings rather than being defined inline.

Show available system tools

image_operations object optional

Enable AI image generation capabilities. When enabled, the model can generate images using DALL-E 2, DALL-E 3, or GPT-Image-1. Learn more

Show properties

mode string required

Controls when image operations are available. Values: on (always enable), off (disable), auto (model decides based on context).

When enabled, the model can call the generate_image tool with these parameters:

  • prompt string required - Text description of the desired image
  • provider string · Defaults to openai - Provider: openai or clipdrop
  • model string · Defaults to dall-e-3 - Model: dall-e-2, dall-e-3, gpt-image-1
  • n integer · Defaults to 1 - Number of images (1-10)
  • size string · Defaults to 1024x1024 - Image dimensions
  • quality string · Defaults to standard - Quality: standard, hd, low, medium, high, auto
  • style string · Defaults to vivid - Style: vivid, natural
  • output_format string · Defaults to png - Format: png, webp, jpeg
  • background string · Defaults to auto - Background: transparent, opaque, auto
  • input_image string optional - URL or base64 for editing mode
  • input_mask string optional - Mask for inpainting
  • input_fidelity string · Defaults to low - Fidelity: low, high
  • user string optional - User ID for tracking

web_search object optional

Enable web search capabilities.

Show properties

mode string required

Values: on, off, auto.

code_interpreter object optional

Enable code execution capabilities.

Show properties

mode string required

Values: on, off, auto.

file_search object optional

Enable file search capabilities.

Show properties

mode string required

Values: on, off, auto.

Example:

{
  "inputs": [{"role": "user", "texts": [{"text": "Generate an image of a sunset over mountains"}]}],
  "model": "ftg-3.0",
  "system_tools": {
    "image_operations": {"mode": "on"}
  }
}

tool_choice string or object optional · Defaults to auto

Controls which tools the model can use during response generation. Use auto to let the model decide, none to disable all tools, required to force tool usage, or specify a particular tool. Learn more

max_tool_calls integer optional · Defaults to 20

Maximum number of built-in tool calls the model can make during response generation. This limit applies to all built-in tools combined (web search, code interpreter, file search, etc.), not per individual tool. Once the limit is reached, any additional tool call attempts are ignored.

parallel_tool_calls boolean optional · Defaults to false

Allow the model to execute multiple tool calls simultaneously rather than sequentially. When enabled, improves performance for tasks requiring multiple independent operations. Disable if tool execution order matters or if you need to control concurrency.

Output Format

output_mode string optional · Defaults to text

Specifies the format of the model's response output. Available modes: text (rich text with markdown), plain (formatting stripped), blocks (structured blocks), json (valid JSON object), json_schema (structured JSON adhering to a provided schema). Learn more

json_schema object optional

Defines the expected structure for json_schema output mode. Required when output_mode is set to json_schema. The model will generate a response that conforms to this schema definition, ensuring type-safe, predictable outputs. Multiple schemas can be provided in a single request; the model selects or combines them based on the described purpose and input context. Learn more

Show properties

id string required

Unique identifier for the JSON schema. Used for referencing and caching schema definitions.

description string required

A description of the schema's purpose. This helps the model determine when to use this schema for the output. For example: "Schema for weather forecast data including temperature, conditions, and location."

schema object required

The JSON Schema definition following JSON Schema Draft 2020-12 specification. Defines the structure, types, and constraints for the expected output.

strict boolean optional · Defaults to true

Enforce strict adherence to the schema. When true, the model guarantees valid output matching the schema. When false, the model attempts best-effort compliance.
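
Putting these fields together, a json_schema definition for this output mode might look like this (ID and schema contents are illustrative):

```json
{
  "output_mode": "json_schema",
  "json_schema": {
    "id": "weather_forecast",
    "description": "Schema for weather forecast data including temperature, conditions, and location.",
    "schema": {
      "type": "object",
      "properties": {
        "location": { "type": "string" },
        "temperature_c": { "type": "number" },
        "conditions": { "type": "string" }
      },
      "required": ["location", "temperature_c"]
    },
    "strict": true
  }
}
```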

Response Parameters

temperature number optional · Defaults to 1.0

Controls output randomness and creativity. Values range from 0.0 to 2.0. Lower values (e.g., 0.2) produce focused, deterministic outputs ideal for factual tasks. Higher values (e.g., 0.8) generate more varied, creative responses suited for brainstorming. Adjust either temperature or top_p, not both.

top_p number optional · Defaults to 1.0

Alternative to temperature for controlling randomness via nucleus sampling. Only the top cumulative probability mass up to top_p is considered. For example, 0.9 means only the top 90% probability mass is used. Values range from 0.0 to 1.0. Adjust either top_p or temperature, not both.

logprobs boolean optional · Defaults to false

Enable log probability output for each generated element. When enabled, the response includes the log probability (confidence score) for each output element, allowing analysis of the model's certainty. Useful for debugging, confidence thresholds, and understanding alternative choices.

top_logprobs integer optional

Number of most likely alternatives to return at each position, along with their log probabilities. Must be between 0 and 20. Only applicable when logprobs is true. Provides insight into what other options the model considered and their relative probabilities.

presence_penalty number optional · Defaults to 0.0

Penalizes content based on whether it appears in the generated text so far, encouraging the model to introduce new topics and concepts. Positive values (up to 2.0) increase novelty; negative values (down to -2.0) encourage staying on topic. Range: -2.0 to 2.0.

frequency_penalty number optional · Defaults to 0.0

Penalizes content based on its frequency in the generated text, reducing repetitive phrasing. Higher values (up to 2.0) decrease repetition; negative values (down to -2.0) allow more repetition. Useful for preventing the model from repeating the same phrases verbatim. Range: -2.0 to 2.0.

truncation string optional · Defaults to disabled

Strategy for handling inputs that exceed the model's context window. Available values: auto (automatically truncate from beginning), disabled (fail with 400 error if exceeded). When set to auto, older messages are dropped to fit within the context window limits.

verbosity string optional · Defaults to medium

Controls the length and detail level of model responses. Lower verbosity produces concise, to-the-point answers ideal for quick information retrieval. Higher verbosity generates comprehensive, detailed explanations suited for learning or thorough analysis. Available values: low, medium, high.

background boolean optional · Defaults to false

Run the model response asynchronously in the background. When enabled, the API returns immediately with a response ID while processing continues. Learn more


Returns

Non-streaming (stream: false): Returns a complete Response object after generation finishes.

Streaming (stream: true): Returns Streaming event objects as Server-Sent Events (SSE). Each event contains a specific event type and data payload showing the model's progress, tool calls, and generated content in real-time.

Bash
curl https://api.aitronos.com/v1/model/response \
  -H "X-API-Key: $FREDDY_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "organization_id": "org_abc123",
    "model": "ftg-3.0",
    "inputs": [
      {
        "role": "user",
        "texts": [
          {
            "text": "What is the capital of France?"
          }
        ]
      }
    ]
  }'

Response:

Complete Response object returned after generation finishes:

{
  "success": true,
  "thread_id": "thread_xyz789",
  "response": "The capital of France is Paris. It is located in the north-central part of the country and is known for its art, culture, and iconic landmarks like the Eiffel Tower.",
  "response_id": "resp_abc123",
  "is_summarized": false
}

Working with Files and Images

The Freddy API supports attaching files and images to conversations in two ways:

Method 1: File References (All Models)

Upload files first, then reference them in conversations. The system extracts text content and injects relevant chunks into the conversation context.

Supported models: All models (GPT, Claude, FTG, etc.)

Supported file types:

  • Documents: PDF, DOCX, XLSX, TXT, MD
  • Images: PNG, JPEG, WebP
  • Code files: PY, JS, TS, HTML, CSS, etc.

Workflow:

  1. Upload file via Files API
  2. Reference file_id in the files array (for documents) or images array (for images)
  3. Model receives extracted content as context
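
For example, attaching a previously uploaded PDF as context (file ID is a placeholder):

```json
{
  "inputs": [
    {
      "role": "user",
      "texts": [{ "text": "Summarize the attached report." }],
      "files": [{ "file_id": "file_doc123" }]
    }
  ]
}
```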

Limits:

  • Max 10 files per message
  • Max file size: 50MB (direct upload), unlimited (resumable upload)

View file upload documentation →

Method 2: Vision API for Structured Data (GPT-4o Only)

Extract structured data from images with JSON schema validation.

Supported models: gpt-4o, gpt-4o-mini only

Use cases:

  • Invoice processing
  • Receipt scanning
  • ID card extraction
  • Form data capture

View Vision API documentation →