Computer Call Output

Computer call output carries the results of Computer Use actions back to the model: primarily screenshots and other visual results that capture the state of the interface after each action.

Overview

After each computer use action (click, type, scroll, etc.), the system captures the resulting screen state and returns it as a computer_call_output item. The model uses this to determine its next action.

Output Structure

{
  "type": "computer_call_output",
  "id": "cco_abc123",
  "call_id": "cu_xyz789",
  "output": {
    "type": "computer_screenshot",
    "image_url": "https://storage.aitronos.com/screenshots/screenshot_abc123.png"
  }
}

The image_url field holds a temporary URL to the screenshot. The URL expires after 1 hour, so download the image if you need it for longer.
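Because the URL is short-lived, a client may want to track when it received a screenshot and save the image before the window closes. The following is a minimal sketch, not part of the API: the names `URL_LIFETIME`, `expires_at`, `is_expired`, and `download_screenshot` are hypothetical, and it assumes the one-hour window starts when the response is received and that the URL can be fetched with a plain GET request. Neither assumption is confirmed by this page.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional
from urllib.request import urlopen

# Documented lifetime of a screenshot URL.
URL_LIFETIME = timedelta(hours=1)


def expires_at(received_at: datetime) -> datetime:
    """Latest time the URL is assumed valid (assumption: clock starts at receipt)."""
    return received_at + URL_LIFETIME


def is_expired(received_at: datetime, now: Optional[datetime] = None) -> bool:
    """Return True once the one-hour window has passed."""
    if now is None:
        now = datetime.now(timezone.utc)
    return now >= expires_at(received_at)


def download_screenshot(image_url: str, path: str) -> None:
    """Save the screenshot to disk (assumption: a plain GET needs no extra headers)."""
    with urlopen(image_url, timeout=30) as resp, open(path, "wb") as fh:
        fh.write(resp.read())
```

A caller would record `datetime.now(timezone.utc)` when the response arrives, then check `is_expired(...)` before reusing a stored URL and request a fresh screenshot if it returns True.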

Accessing Screenshots

import os

import requests

# Ask the model to take a screenshot using the computer use tool.
response = requests.post(
    "https://api.aitronos.com/v1/model/response",
    headers={"X-API-Key": os.environ["FREDDY_API_KEY"]},
    json={
        "organization_id": "org_your_org_id",
        "tools": [{"type": "computer_use_preview"}],
        "inputs": [{"role": "user", "content": "Take a screenshot of the current screen."}],
    },
)
response.raise_for_status()  # fail early on HTTP errors

# Print the URL of every screenshot returned in the output list.
for item in response.json()["output"]:
    if item.get("type") == "computer_call_output":
        image_url = item["output"]["image_url"]
        print(f"Screenshot: {image_url}")
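The filtering step can also be pulled into a small helper that works on an already-parsed output list, which makes it easy to exercise without a network call. This is a sketch under assumptions: `collect_screenshots` is a hypothetical name, the sample list mirrors the structure shown under Output Structure, and the `message` item is an invented placeholder for any non-screenshot entry.

```python
from typing import Any, Dict, List


def collect_screenshots(output_items: List[Dict[str, Any]]) -> Dict[str, str]:
    """Map call_id -> image_url for every computer_call_output screenshot."""
    screenshots: Dict[str, str] = {}
    for item in output_items:
        if item.get("type") != "computer_call_output":
            continue  # skip items that are not computer call outputs
        output = item.get("output") or {}
        if output.get("type") != "computer_screenshot":
            continue  # skip outputs that are not screenshots
        call_id = item.get("call_id")
        url = output.get("image_url")
        if call_id and url:
            screenshots[call_id] = url
    return screenshots


sample = [
    {"type": "message", "content": "Done."},  # hypothetical non-screenshot item
    {
        "type": "computer_call_output",
        "id": "cco_abc123",
        "call_id": "cu_xyz789",
        "output": {
            "type": "computer_screenshot",
            "image_url": "https://storage.aitronos.com/screenshots/screenshot_abc123.png",
        },
    },
]
print(collect_screenshots(sample))
```

Keying the result by `call_id` keeps each screenshot associated with the action that produced it, and the `.get` lookups avoid a `KeyError` if an item lacks a field.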

Output Types

Type	Description
`computer_screenshot`	Full screen capture after an action

Modes

Mode	Behavior
`on`	Capture outputs after every action
`off`	Disable output capture
`auto`	Capture when the model requests it

Providing Screenshot Inputs

You can also feed screenshots into the model as inputs, to give it a starting screen state:

{
  "inputs": [
    {
      "role": "user",
      "content": [
        {
          "type": "computer_call_output",
          "output": {
            "type": "computer_screenshot",
            "image_url": "https://your-server.com/screenshot.png"
          }
        },
        {"type": "text", "text": "What do you see on this screen?"}
      ]
    }
  ]
}
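As an illustration, the same payload can be assembled in Python so the screenshot URL and the question are passed in as parameters. `build_screenshot_input` is a hypothetical helper, not part of any SDK. The sketch builds only the `inputs` portion shown above; a complete request would presumably also carry the other fields from the Accessing Screenshots example, such as `organization_id` and `tools`.

```python
import json
from typing import Any, Dict


def build_screenshot_input(image_url: str, question: str) -> Dict[str, Any]:
    """Build an `inputs` payload pairing a screenshot with a text question."""
    return {
        "inputs": [
            {
                "role": "user",
                "content": [
                    {
                        "type": "computer_call_output",
                        "output": {
                            "type": "computer_screenshot",
                            "image_url": image_url,
                        },
                    },
                    {"type": "text", "text": question},
                ],
            }
        ]
    }


payload = build_screenshot_input(
    "https://your-server.com/screenshot.png",
    "What do you see on this screen?",
)
print(json.dumps(payload, indent=2))
```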

Computer Use Preview — Full computer use documentation
System Tools Overview — All available built-in tools
Images and Vision — Image input handling
