
Cancels an in-progress streaming response, applying one of two timing-based cancellation modes.

POST https://api.aitronos.com/v1/model/response/{thread_id}/cancel

Stops a streaming response that is currently being generated. The cancellation behavior depends on how long the stream has been running:

  • Quick Edit Mode (< 5 seconds): Both user message and assistant response are deleted. Frontend should restore the message to input.
  • Keep Partial Mode (≥ 5 seconds): User message remains, partial response is saved with cancelled: true metadata.
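
The 5-second boundary above can be sketched as a small helper. This is only an illustration of the documented rule: the server decides the mode, and clients should rely on the cancel_mode field it returns. The function and constant names are hypothetical.

```python
# Threshold taken from the mode descriptions above (< 5 s vs. >= 5 s).
QUICK_EDIT_WINDOW_SECONDS = 5.0


def expected_cancel_mode(elapsed_seconds: float) -> str:
    """Return the mode the documentation describes for a given stream age.

    Hypothetical helper: the authoritative value is the `cancel_mode`
    field in the API response, not this client-side estimate.
    """
    if elapsed_seconds < 0:
        raise ValueError("elapsed_seconds must be non-negative")
    if elapsed_seconds < QUICK_EDIT_WINDOW_SECONDS:
        return "quick_edit"
    return "keep_partial"
```

Note that exactly 5 seconds falls into keep_partial, since the quick-edit window is strictly less than 5 seconds.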

Path Parameters

thread_id string required

The thread ID containing the active stream to cancel.

Response

success boolean

Whether the cancellation was successful.

cancel_mode string

The cancellation mode applied: quick_edit or keep_partial.

elapsed_seconds number

How long the stream was running before cancellation.

user_message_deleted boolean

Whether the user's message was deleted (true in quick_edit mode).

assistant_response_deleted boolean

Whether the assistant's response was deleted (true in quick_edit mode).

partial_content_saved boolean

Whether partial content was preserved (true in keep_partial mode).
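
A minimal sketch of how a client might parse this response into a typed object, assuming the body is JSON containing exactly the six fields listed above. The CancelResult class and parse_cancel_result function are hypothetical names, not part of any SDK.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class CancelResult:
    """Typed view of the cancel response fields documented above."""
    success: bool
    cancel_mode: str
    elapsed_seconds: float
    user_message_deleted: bool
    assistant_response_deleted: bool
    partial_content_saved: bool


def parse_cancel_result(body: str) -> CancelResult:
    """Parse a JSON response body; reject modes the docs do not list."""
    data = json.loads(body)
    mode = data["cancel_mode"]
    if mode not in ("quick_edit", "keep_partial"):
        raise ValueError(f"unexpected cancel_mode: {mode!r}")
    return CancelResult(
        success=bool(data["success"]),
        cancel_mode=mode,
        elapsed_seconds=float(data["elapsed_seconds"]),
        user_message_deleted=bool(data["user_message_deleted"]),
        assistant_response_deleted=bool(data["assistant_response_deleted"]),
        partial_content_saved=bool(data["partial_content_saved"]),
    )
```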


Cancellation Modes

Quick Edit Mode (< 5 seconds)

When the stream is cancelled less than 5 seconds after it starts:

  • Both user message and assistant response are deleted from the thread
  • Frontend should restore the user's message to the input field
  • Attachments should be restored
  • Chat appears as if the message was never sent

Use case: User realizes they made a typo or wants to rephrase immediately.

Keep Partial Mode (≥ 5 seconds)

When the stream is cancelled 5 seconds or more after it starts:

  • User message remains in the thread
  • Partial assistant response is saved with cancelled: true metadata
  • Response stays visible in the chat
  • No message restoration to input

Use case: User got enough information or wants to stop a long response.
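
The frontend behavior for the two modes could look roughly like the following sketch. The ChatState structure and apply_cancel function are hypothetical stand-ins for whatever state a real client keeps; only the mode names and the cancelled flag come from this page.

```python
from dataclasses import dataclass, field


@dataclass
class ChatState:
    """Hypothetical client-side state: the input box plus visible messages."""
    input_text: str = ""
    input_attachments: list = field(default_factory=list)
    messages: list = field(default_factory=list)  # dicts with "role", "content"


def apply_cancel(state: ChatState, cancel_mode: str) -> ChatState:
    """Update local state to match the mode reported by the cancel endpoint."""
    if cancel_mode == "quick_edit":
        # Drop the partial assistant response, if one was already rendered.
        if state.messages and state.messages[-1]["role"] == "assistant":
            state.messages.pop()
        # Remove the user message and restore it (and attachments) to the input.
        if state.messages and state.messages[-1]["role"] == "user":
            user_msg = state.messages.pop()
            state.input_text = user_msg["content"]
            state.input_attachments = list(user_msg.get("attachments", []))
    elif cancel_mode == "keep_partial":
        # Keep both messages; flag the partial response as cancelled.
        if state.messages and state.messages[-1]["role"] == "assistant":
            state.messages[-1]["cancelled"] = True
    else:
        raise ValueError(f"unexpected cancel_mode: {cancel_mode!r}")
    return state
```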


SSE Events

When a stream is cancelled, a response.cancelled event is emitted instead of response.completed:

{
  "event": "response.cancelled",
  "status": "cancelled",
  "thread_id": "thrd_abc123",
  "partial_content_length": 1234
}
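
A client reading the stream needs to treat response.cancelled as a terminal event alongside response.completed. The sketch below assumes each event payload arrives as a JSON string shaped like the example above; how the payload is framed on the wire is not specified here, and the terminal_status name is hypothetical.

```python
import json
from typing import Optional

# The two terminal event names mentioned on this page.
TERMINAL_EVENTS = {
    "response.completed": "completed",
    "response.cancelled": "cancelled",
}


def terminal_status(raw_event: str) -> Optional[str]:
    """Return "completed" or "cancelled" for a terminal event, else None."""
    payload = json.loads(raw_event)
    return TERMINAL_EVENTS.get(payload.get("event"))
```

A reader loop would stop consuming the stream as soon as this returns a non-None value.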

Notes

  • Idempotent: Repeated calls after the first are safe
  • Usage Tracking: Tokens generated before cancellation are still tracked for billing
  • Tool Calls: Current tool execution completes, but no further tools are called
  • All Output Modes: Works with text, json, blocks, and plain modes

Example (Bash)
curl -X POST https://api.aitronos.com/v1/model/response/thrd_abc123/cancel \
  -H "X-API-Key: $FREDDY_API_KEY"

Response when cancelled less than 5 seconds after the stream starts (quick_edit mode):

{
  "success": true,
  "cancel_mode": "quick_edit",
  "elapsed_seconds": 2.3,
  "user_message_deleted": true,
  "assistant_response_deleted": true,
  "partial_content_saved": false
}
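
For callers not using curl, the same request can be built with Python's standard library. This is a minimal sketch, assuming the endpoint needs no request body and only the X-API-Key header shown in the curl example; build_cancel_request is a hypothetical helper name.

```python
import urllib.parse
import urllib.request

BASE_URL = "https://api.aitronos.com/v1/model/response"


def build_cancel_request(thread_id: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) the POST request for the cancel endpoint."""
    # Percent-encode the thread ID so it is safe as a single path segment.
    safe_id = urllib.parse.quote(thread_id, safe="")
    request = urllib.request.Request(f"{BASE_URL}/{safe_id}/cancel", method="POST")
    request.add_header("X-API-Key", api_key)
    return request
```

Passing the result to urllib.request.urlopen sends the request; the response body can then be decoded as JSON.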