Cancels an in-progress streaming response, with behavior that depends on how long the stream has been running.
Stops a streaming response that is currently being generated. The cancellation behavior depends on how long the stream has been running:
- Quick Edit Mode (< 5 seconds): Both user message and assistant response are deleted. Frontend should restore the message to input.
- Keep Partial Mode (≥ 5 seconds): User message remains, and the partial response is saved with cancelled: true in its metadata.
thread_id string required
The thread ID containing the active stream to cancel.
success boolean
Whether the cancellation was successful.
cancel_mode string
The cancellation mode applied: quick_edit or keep_partial.
elapsed_seconds number
How long the stream was running before cancellation.
user_message_deleted boolean
Whether the user's message was deleted (true in quick_edit mode).
assistant_response_deleted boolean
Whether the assistant's response was deleted (true in quick_edit mode).
partial_content_saved boolean
Whether partial content was preserved (true in keep_partial mode).
When cancelled within 5 seconds of starting:
- Both user message and assistant response are deleted from the thread
- Frontend should restore the user's message to the input field
- Attachments should be restored
- Chat appears as if the message was never sent
Use case: User realizes they made a typo or wants to rephrase immediately.
When cancelled 5 seconds or more after starting:
- User message remains in the thread
- Partial assistant response is saved with cancelled: true in its metadata
- Response stays visible in the chat
- No message restoration to input
Use case: User got enough information or wants to stop a long response.
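The two modes above imply different frontend follow-up work. A minimal sketch of how a client might turn the cancel response into UI steps, assuming the response fields documented above; the `UiAction` type and `plan_ui_update` helper are hypothetical names used only for illustration, and returning `None` on an unsuccessful cancellation is an assumption, not documented behavior:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class UiAction:
    """Hypothetical description of what the frontend should do next."""
    restore_input: bool         # put the user's text back into the input field
    restore_attachments: bool   # re-attach the files that were sent
    keep_partial_visible: bool  # leave the partial response in the chat


def plan_ui_update(cancel_response: dict) -> Optional[UiAction]:
    """Map a cancel response to frontend steps.

    Returns None when the cancellation did not succeed (assumption:
    the UI is then left unchanged).
    """
    if not cancel_response.get("success"):
        return None
    quick_edit = cancel_response.get("cancel_mode") == "quick_edit"
    return UiAction(
        restore_input=quick_edit,
        restore_attachments=quick_edit,
        keep_partial_visible=not quick_edit,
    )
```

Branching on cancel_mode rather than on elapsed_seconds keeps the 5-second threshold on the server side, so the client does not need to duplicate it.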
When a stream is cancelled, a response.cancelled event is emitted instead of response.completed:
{
"event": "response.cancelled",
"status": "cancelled",
"thread_id": "thrd_abc123",
"partial_content_length": 1234
}
- Idempotent: Can be called multiple times safely after first call
- Usage Tracking: Tokens generated before cancellation are still tracked for billing
- Tool Calls: Current tool execution completes, but no further tools are called
- All Output Modes: Works with text, json, blocks, and plain modes
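Because a cancelled stream ends with response.cancelled rather than response.completed, a stream consumer needs to treat both as terminal. A minimal sketch, assuming each event reaches the client as a JSON string shaped like the sample above; the transport framing is not specified here, and `handle_stream_event` is a hypothetical helper name:

```python
import json

# Event names taken from the description above.
TERMINAL_EVENTS = {"response.completed", "response.cancelled"}


def handle_stream_event(raw_event: str) -> dict:
    """Classify one stream event (assumed to be a JSON string)."""
    event = json.loads(raw_event)
    name = event.get("event")
    return {
        # True once the stream has ended, normally or by cancellation
        "finished": name in TERMINAL_EVENTS,
        # True only for the cancellation event
        "cancelled": name == "response.cancelled",
        # Present on response.cancelled in the sample; None otherwise
        "partial_content_length": event.get("partial_content_length"),
    }
```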
curl -X POST https://api.aitronos.com/v1/model/response/thrd_abc123/cancel \
-H "X-API-Key: $FREDDY_API_KEY"
Response when cancelled within 5 seconds:
{
"success": true,
"cancel_mode": "quick_edit",
"elapsed_seconds": 2.3,
"user_message_deleted": true,
"assistant_response_deleted": true,
"partial_content_saved": false
}
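The same request can be sketched in Python using only the standard library. The URL pattern, the X-API-Key header, and the FREDDY_API_KEY environment variable come from the curl example above; the helper names `build_cancel_request` and `cancel_stream` are hypothetical, and sending no request body is an assumption based on the curl call sending none:

```python
import json
import os
import urllib.request

BASE_URL = "https://api.aitronos.com/v1"


def build_cancel_request(thread_id: str, api_key: str) -> urllib.request.Request:
    """Build the POST request without sending it."""
    url = f"{BASE_URL}/model/response/{thread_id}/cancel"
    return urllib.request.Request(
        url,
        method="POST",
        headers={"X-API-Key": api_key},
    )


def cancel_stream(thread_id: str, api_key: str) -> dict:
    """Send the cancel request and return the parsed JSON response."""
    request = build_cancel_request(thread_id, api_key)
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    result = cancel_stream("thrd_abc123", os.environ["FREDDY_API_KEY"])
    print(result["cancel_mode"], result["elapsed_seconds"])
```

Since the endpoint is described as idempotent, retrying `cancel_stream` after a network error should be safe.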