Computer Use Preview

Computer use is an experimental capability that lets the model interact with graphical interfaces (clicking, typing, scrolling, and navigating) as a human user would. This makes it possible to automate tasks that have no programmatic API.

Experimental: Computer use is in preview. Behavior may be unpredictable on complex interfaces. Test thoroughly before using in production workflows.

Enabling Computer Use

{
 "organization_id": "org_your_org_id",
 "assistant_id": "asst_abc123",
 "tools": [{"type": "computer_use_preview"}],
 "inputs": [{"role": "user", "content": "Open the settings page and take a screenshot."}]
}
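
As a rough illustration, the request body above could be assembled in Python as follows. The helper name `build_computer_use_request` is hypothetical, and the sketch only builds and serializes the body; the endpoint and authentication are not described on this page, so they are left out.

```python
import json


def build_computer_use_request(organization_id, assistant_id, prompt):
    """Return a request body shaped like the example above (hypothetical helper)."""
    return {
        "organization_id": organization_id,
        "assistant_id": assistant_id,
        # Including this tool entry is what enables computer use for the request.
        "tools": [{"type": "computer_use_preview"}],
        "inputs": [{"role": "user", "content": prompt}],
    }


body = build_computer_use_request(
    "org_your_org_id",
    "asst_abc123",
    "Open the settings page and take a screenshot.",
)
payload = json.dumps(body, indent=1)
print(payload)
```

Keeping the body in one helper makes it easy to reuse the same tool configuration across prompts.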

How It Works

1. The model receives a screenshot of the current screen state.
2. It decides which action to take (click, type, scroll, etc.).
3. The action is executed and a new screenshot is captured.
4. This loop continues until the task is complete or the model signals it's done.

Each action cycle produces a computer_use_preview output item.
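
The cycle described above can be sketched as a simple loop. This is a minimal illustration, not the service's implementation: the three callbacks (`capture_screenshot`, `decide_action`, `execute_action`) and the `max_steps` safety cap are assumptions introduced here, and a return value of `None` from `decide_action` stands in for "the model signals it's done".

```python
from typing import Callable, Optional


def run_action_loop(
    capture_screenshot: Callable[[], str],
    decide_action: Callable[[str], Optional[dict]],
    execute_action: Callable[[dict], None],
    max_steps: int = 20,  # assumed cap so a confused model cannot loop forever
) -> list:
    """Run the screenshot -> decide -> execute cycle and return executed actions."""
    history = []
    for _ in range(max_steps):
        screenshot = capture_screenshot()    # step 1: observe the screen
        action = decide_action(screenshot)   # step 2: choose an action
        if action is None:                   # step 4: the model is done
            break
        execute_action(action)               # step 3: act, then loop to re-capture
        history.append(action)
    return history


# Demo with stubbed callbacks: two scripted actions, then a "done" signal.
scripted = iter([{"type": "click"}, {"type": "type"}, None])
executed = []
history = run_action_loop(
    capture_screenshot=lambda: "screenshot-placeholder",
    decide_action=lambda shot: next(scripted),
    execute_action=executed.append,
)
```

Passing the steps in as callbacks keeps the loop testable without a real screen or model.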

Response Structure

{
  "output": [
    {
      "type": "computer_use_preview",
      "id": "cu_abc123",
      "status": "completed",
      "action": {
        "type": "screenshot"
      },
      "output": {
        "image_url": "https://..."
      }
    }
  ]
}
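
A response of this shape could be read with a few lines of Python. The sketch below relies only on the fields shown in the sample; the helper name `completed_actions` is hypothetical, and skipping items whose status is not `completed` is an assumption, since other status values are not listed here.

```python
import json

# The sample response from above, with the elided URL left as-is.
SAMPLE_RESPONSE = """
{
  "output": [
    {
      "type": "computer_use_preview",
      "id": "cu_abc123",
      "status": "completed",
      "action": {"type": "screenshot"},
      "output": {"image_url": "https://..."}
    }
  ]
}
"""


def completed_actions(response: dict) -> list:
    """Return (id, action type, image_url) for each completed computer-use item."""
    results = []
    for item in response.get("output", []):
        if item.get("type") != "computer_use_preview":
            continue  # ignore any other kinds of output items
        if item.get("status") != "completed":
            continue  # assumption: only completed items carry usable output
        action_type = item.get("action", {}).get("type")
        image_url = (item.get("output") or {}).get("image_url")
        results.append((item.get("id"), action_type, image_url))
    return results


rows = completed_actions(json.loads(SAMPLE_RESPONSE))
print(rows)
```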

Supported Actions

Action	Description
`screenshot`	Capture the current screen
`click`	Click at screen coordinates
`double_click`	Double-click at coordinates
`type`	Type text at the current cursor position
`key`	Press a keyboard key or shortcut
`scroll`	Scroll the page
`drag`	Click and drag between coordinates
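
When executing actions yourself, it can help to reject anything outside the table above before it reaches the host. A minimal sketch, assuming each action is a dictionary with a `type` field as in the response example; the helper name `validate_action` is hypothetical, and per-action parameters such as coordinates are not specified on this page, so they are not checked.

```python
# Action types listed in the Supported Actions table.
SUPPORTED_ACTIONS = frozenset(
    {"screenshot", "click", "double_click", "type", "key", "scroll", "drag"}
)


def validate_action(action: dict) -> str:
    """Return the action type, or raise ValueError if it is not in the table."""
    action_type = action.get("type")
    if action_type not in SUPPORTED_ACTIONS:
        raise ValueError(f"Unsupported action type: {action_type!r}")
    return action_type


print(validate_action({"type": "click"}))
```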

Tool Modes

Mode	Behavior
`on`	Enable computer use
`off`	Disable computer use

Use Cases

Automating desktop applications without an API
Web scraping and form filling
Testing UI workflows
Capturing screenshots of dynamic content

Safety Considerations

Computer use executes real actions on the host system. Take these precautions:

Run in an isolated/sandboxed environment
Review the model's planned actions before execution when possible
Set strict instructions about which applications and URLs are allowed
Never give the model access to sensitive systems or credentials
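
One way to back up instructions about allowed URLs is to enforce them in your own code before any navigation is executed. The sketch below is an illustration under assumptions: the host `staging.example.com` is a placeholder, the helper name `is_url_allowed` is hypothetical, and requiring HTTPS is a choice made here rather than a documented requirement.

```python
from urllib.parse import urlparse

# Placeholder allowlist; replace with the hosts your workflow actually needs.
ALLOWED_HOSTS = frozenset({"staging.example.com"})


def is_url_allowed(url: str, allowed_hosts=ALLOWED_HOSTS) -> bool:
    """Return True only for HTTPS URLs whose host is exactly on the allowlist."""
    parsed = urlparse(url)
    if parsed.scheme != "https":
        return False
    # Exact hostname match, so look-alike hosts with extra suffixes are rejected.
    return parsed.hostname in allowed_hosts


print(is_url_allowed("https://staging.example.com/settings"))
```

Matching on the parsed hostname rather than a substring of the URL avoids accepting addresses that merely contain an allowed name.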

System Tools Overview — All available built-in tools
Computer Call Output — Handling computer action outputs
Agentic Workflows — Multi-step automation
