Threads provide stateful, persistent conversations that automatically maintain context across multiple model responses.

Overview

A thread is a container for a multi-turn conversation between users and AI assistants. Instead of manually managing conversation history, threads automatically:

  • Store messages: All inputs and outputs are saved
  • Maintain context: Previous messages are automatically included in new requests
  • Persist state: Conversations survive across sessions
  • Enable continuation: Resume conversations anytime with the thread ID

How Threads Work

When you create a model response with a thread:

  1. Thread lookup: Existing messages in the thread are retrieved
  2. Context injection: Thread messages are prepended to your new inputs
  3. Processing: The model sees full conversation history
  4. Auto-save: Your new inputs and outputs are added to the thread

Thread: thread_abc123
├─ Message 1: "Hello" (user)
├─ Message 2: "Hi! How can I help?" (assistant)
├─ Message 3: "What's the weather?" (user) ← New request
└─ Message 4: "I'll check that for you..." (assistant) ← Auto-saved
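
From the client's perspective, only the new turn is sent; the messages already stored in the thread are prepended server-side before the model runs. A minimal sketch of the request behind Message 3 above (the exact wire format of the injected history is internal and not shown here):

// Only the new user message is sent; earlier thread messages are injected automatically
const response = await fetch('https://api.aitronos.com/v1/model/response', {
  method: 'POST',
  headers: { 'X-API-Key': apiKey, 'Content-Type': 'application/json' },
  body: JSON.stringify({
    model: 'gpt-4.1',
    thread: 'thread_abc123',
    inputs: [{ role: 'user', texts: [{ text: "What's the weather?" }] }]
  })
});
// The model sees Messages 1-3, and its reply is auto-saved as Message 4.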

Creating a Thread

Option 1: Auto-Create with String ID

Pass a thread ID string. If it doesn't exist, it will be created:

curl https://api.aitronos.com/v1/model/response \
  -H "X-API-Key: $FREDDY_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4.1",
    "thread": "thread_user123_session1",
    "inputs": [
      {
        "role": "user",
        "texts": [{"text": "Hello, I need help with Python"}]
      }
    ]
  }'

Option 2: Create with Metadata

Pass an object to create a thread with metadata:

{
  "model": "gpt-4.1",
  "thread": {
    "id": "thread_user123_session1",
    "metadata": {
      "userId": "user_123",
      "sessionId": "sess_456",
      "topic": "python_help"
    }
  },
  "inputs": [
    {
      "role": "user",
      "texts": [{"text": "Hello, I need help with Python"}]
    }
  ]
}

Option 3: Thread-First Creation

Create an empty thread first, then use it:

# Create thread
curl https://api.aitronos.com/v1/threads \
  -H "X-API-Key: $FREDDY_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "metadata": {
      "userId": "user_123",
      "purpose": "support_chat"
    }
  }'

# Use the returned thread ID in responses
curl https://api.aitronos.com/v1/model/response \
  -H "X-API-Key: $FREDDY_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4.1",
    "thread": "thread_67cc8901",
    "inputs": [...]
  }'
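
The same flow in JavaScript, as a sketch: it assumes the POST /v1/threads response body includes the new thread's id (the thread_67cc8901 placeholder above), which you then pass to subsequent responses.

const headers = { 'X-API-Key': apiKey, 'Content-Type': 'application/json' };

// Create the thread and read its ID (assumed to be returned as `id`)
const thread = await fetch('https://api.aitronos.com/v1/threads', {
  method: 'POST',
  headers,
  body: JSON.stringify({ metadata: { userId: 'user_123', purpose: 'support_chat' } })
}).then(r => r.json());

// Use the returned thread ID for model responses
const response = await fetch('https://api.aitronos.com/v1/model/response', {
  method: 'POST',
  headers,
  body: JSON.stringify({
    model: 'gpt-4.1',
    thread: thread.id, // e.g. "thread_67cc8901"
    inputs: [{ role: 'user', texts: [{ text: 'Hello, I need help with Python' }] }]
  })
});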

Continuing a Conversation

Simply reference the same thread ID:

// First message
const response1 = await fetch('/v1/model/response', {
  method: 'POST',
  headers: { 'X-API-Key': apiKey },
  body: JSON.stringify({
    model: 'gpt-4.1',
    thread: 'thread_chat_001',
    inputs: [{ role: 'user', texts: [{ text: 'What is Python?' }] }]
  })
});

// Continue conversation (history is automatic)
const response2 = await fetch('/v1/model/response', {
  method: 'POST',
  headers: { 'X-API-Key': apiKey },
  body: JSON.stringify({
    model: 'gpt-4.1',
    thread: 'thread_chat_001', // Same thread ID
    inputs: [{ role: 'user', texts: [{ text: 'Can you show an example?' }] }]
  })
});
// Model automatically sees the previous Q&A about Python

Thread States

Threads have a lifecycle state that controls how they can be used:

Available States

open (default)

  • Accepts new messages and responses
  • Thread can be modified and extended
  • Use for active conversations

locked

  • Read-only mode - no new messages allowed
  • Existing messages remain accessible
  • Useful for preserving completed conversations
  • Can be unlocked back to open

archived

  • Preserved but inactive
  • Cannot add messages (permanent)
  • Still accessible for reading
  • Use for long-term storage of resolved conversations

Setting Thread State

# Lock a thread to prevent modifications
curl https://api.aitronos.com/v1/model/response \
  -H "X-API-Key: $FREDDY_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4.1",
    "thread": "thread_abc123",
    "threadState": "locked",
    "inputs": [...]
  }'

# Archive a completed conversation
curl https://api.aitronos.com/v1/threads/thread_abc123 \
  -X PATCH \
  -H "X-API-Key: $FREDDY_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"state": "archived"}'

State Transitions

open ──────────────────> locked
 ↑                          ↓
 └──────── unlock ──────────┘

open ──────────────────> archived (permanent)
locked ────────────────> archived (permanent)

Use Cases

Lock threads when:

  • Conversation is complete but may need reference
  • Compliance requires immutable records
  • Support ticket is resolved but not ready for archival
  • Quality review is in progress

Archive threads when:

  • Conversation will never be continued
  • Long-term storage is needed
  • Freeing up active thread capacity
  • Compliance retention period begins

Thread Visibility

Control whether threads appear in the user's thread list using the store parameter when creating model responses.

Visible Threads (Default)

By default, all threads are visible in the list threads API (store=true):

curl https://api.aitronos.com/v1/model/response \
  -H "X-API-Key: $FREDDY_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "organization_id": "org_abc123",
    "model": "ftg-3.0",
    "store": true,
    "inputs": [
      {
        "role": "user",
        "texts": [{"text": "Hello"}]
      }
    ]
  }'

Hidden Threads

Create threads that don't appear in the list threads API by setting store=false:

curl https://api.aitronos.com/v1/model/response \
  -H "X-API-Key: $FREDDY_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "organization_id": "org_abc123",
    "model": "ftg-3.0-speed",
    "store": false,
    "inputs": [
      {
        "role": "user",
        "texts": [{"text": "Generate a title for this conversation"}]
      }
    ]
  }'

Hidden threads are:

  • ✅ Created and stored in the database
  • ✅ Accessible by direct thread ID lookup
  • ✅ Fully functional for all thread operations
  • ❌ Not visible in the list threads API (by default)
  • ❌ Not shown in the user interface

Use Cases for Hidden Threads

Internal Operations

# Generate thread name without creating visible thread
response = requests.post(
    'https://api.aitronos.com/v1/model/response',
    headers={'Authorization': f'Bearer {api_key}'},
    json={
        'organization_id': 'org_abc123',
        'model': 'ftg-3.0-speed',
        'store': False,  # Hidden thread
        'inputs': [
            {'role': 'user', 'texts': [{'text': 'Generate a title'}]}
        ]
    }
)

Background Tasks

// Process data without cluttering thread list
const response = await fetch('/v1/model/response', {
  method: 'POST',
  body: JSON.stringify({
    organization_id: 'org_abc123',
    model: 'ftg-3.0',
    store: false, // Hidden thread
    inputs: [
      { role: 'user', texts: [{ text: 'Analyze this dataset' }] }
    ]
  })
});

Temporary Threads

# Create ephemeral thread for one-time operation
def generate_summary(content):
    response = requests.post(
        'https://api.aitronos.com/v1/model/response',
        json={
            'organization_id': 'org_abc123',
            'model': 'ftg-3.0',
            'store': False,  # Won't appear in user's thread list
            'inputs': [
                {'role': 'user', 'texts': [{'text': f'Summarize: {content}'}]}
            ]
        }
    )
    return response.json()

Accessing Hidden Threads

Hidden threads can still be accessed directly by ID:

# Get hidden thread by ID
curl https://api.aitronos.com/v1/threads/thread_abc123 \
  -H "X-API-Key: $FREDDY_API_KEY"

# List hidden threads explicitly
curl https://api.aitronos.com/v1/threads?visible_in_ui=false \
  -H "X-API-Key: $FREDDY_API_KEY"

Best Practices

Use store=true (default) for:

  • User-facing conversations
  • Customer support chats
  • Multi-turn dialogues
  • Conversations that need to be resumed

Use store=false for:

  • Internal operations (name generation, summaries)
  • Background processing
  • Temporary or ephemeral threads
  • System-level tasks
  • Automated workflows

Stateless Conversations

For scenarios where you don't want persistent threads, use previousResponseId to chain responses:

// First message
const response1 = await fetch('/v1/model/response', {
  method: 'POST',
  body: JSON.stringify({
    model: 'gpt-4.1',
    inputs: [{ role: 'user', texts: [{ text: 'Hello' }] }]
  })
});

const data1 = await response1.json();

// Continue without thread - just reference previous response
const response2 = await fetch('/v1/model/response', {
  method: 'POST',
  body: JSON.stringify({
    model: 'gpt-4.1',
    previousResponseId: data1.id, // Link to previous response
    inputs: [{ role: 'user', texts: [{ text: 'Tell me more' }] }]
  })
});
// Model sees context from response1 without creating a persistent thread

When to use previousResponseId:

  • Short, ephemeral conversations
  • Zero data retention requirements
  • Single-session workflows
  • Testing without polluting thread storage

Limitations:

  • Cannot reference responses older than 24 hours
  • No persistent history (only previous response)
  • Cannot be combined with thread parameter
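
To keep a short stateless exchange going, carry the latest response ID forward into the next call. A minimal sketch, subject to the limitations above (each call links only to the immediately preceding response, which must be less than 24 hours old):

let lastResponseId = null;

async function askStateless(text) {
  const body = {
    model: 'gpt-4.1',
    inputs: [{ role: 'user', texts: [{ text }] }]
  };
  if (lastResponseId) body.previousResponseId = lastResponseId; // chain to the previous turn

  const response = await fetch('https://api.aitronos.com/v1/model/response', {
    method: 'POST',
    headers: { 'X-API-Key': apiKey, 'Content-Type': 'application/json' },
    body: JSON.stringify(body)
  });
  const data = await response.json();
  lastResponseId = data.id; // the next call will reference this response
  return data;
}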

Thread Management

Retrieve Thread Messages

curl https://api.aitronos.com/v1/threads/thread_67cc8901/messages \
  -H "X-API-Key: $FREDDY_API_KEY"

Update Thread Metadata

curl https://api.aitronos.com/v1/threads/thread_67cc8901 \
  -X PATCH \
  -H "X-API-Key: $FREDDY_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "metadata": {
      "status": "resolved",
      "rating": 5
    }
  }'

Delete Thread

curl https://api.aitronos.com/v1/threads/thread_67cc8901 \
  -X DELETE \
  -H "X-API-Key: $FREDDY_API_KEY"

Use Cases

Customer Support Chat

def handle_support_message(user_id, message):
    thread_id = f"support_{user_id}"
    
    response = requests.post(
        'https://api.aitronos.com/v1/model/response',
        headers={'Authorization': f'Bearer {api_key}'},
        json={
            'model': 'gpt-4.1',
            'thread': {
                'id': thread_id,
                'metadata': {
                    'userId': user_id,
                    'department': 'support',
                    'priority': 'normal'
                }
            },
            'inputs': [
                {'role': 'user', 'texts': [{'text': message}]}
            ]
        }
    )
    return response.json()

Multi-Session Conversations

class ConversationManager {
  constructor(userId) {
    this.threadId = `user_${userId}_${Date.now()}`;
  }
  
  async sendMessage(text) {
    const response = await fetch('/v1/model/response', {
      method: 'POST',
      body: JSON.stringify({
        model: 'gpt-4.1',
        thread: this.threadId,
        inputs: [{ role: 'user', texts: [{ text }] }]
      })
    });
    return await response.json();
  }
  
  getThreadId() {
    return this.threadId;
  }
}

// Usage
const chat = new ConversationManager('user_123');
await chat.sendMessage('Hello');
await chat.sendMessage('Tell me more'); // Remembers "Hello" context

Guided Workflows

def onboarding_workflow(user_id, step, user_input):
    thread_id = f"onboarding_{user_id}"
    
    # System prompt sets the context for the thread
    system_message = {
        'role': 'system',
        'texts': [{'text': 'You are helping a user through an onboarding process. Current step: ' + step}]
    }
    
    response = requests.post(
        'https://api.aitronos.com/v1/model/response',
        json={
            'model': 'gpt-4.1',
            'thread': thread_id,
            'inputs': [
                system_message,
                {'role': 'user', 'texts': [{'text': user_input}]}
            ]
        }
    )
    
    return response.json()

Thread Limits

  • Maximum messages per thread: 10,000
  • Message retention: 90 days after last activity
  • Max thread age: 1 year
  • Metadata size: 16KB per thread
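
When setting or updating metadata, it can help to guard against the 16KB limit up front. A sketch of a client-side check, assuming the limit applies to the serialized JSON size (the exact accounting is not specified here):

// Hypothetical helper: reject metadata that would exceed the 16KB thread limit
function assertMetadataWithinLimit(metadata, limitBytes = 16 * 1024) {
  const size = new TextEncoder().encode(JSON.stringify(metadata)).length;
  if (size > limitBytes) {
    throw new Error(`Thread metadata is ${size} bytes; the limit is ${limitBytes} bytes`);
  }
}

assertMetadataWithinLimit({ status: 'resolved', rating: 5 });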

Best Practices

Use Descriptive Thread IDs

// ✅ Good - descriptive and unique
const supportThreadId = `support_${userId}_${ticketId}`;
const chatThreadId = `chat_${sessionId}_${timestamp}`;

// ❌ Bad - too generic or not unique
const badThreadId = 'thread1';
const alsoBad = userId; // May conflict across contexts

Add Meaningful Metadata

{
  "thread": {
    "id": "support_user123_ticket789",
    "metadata": {
      "userId": "user_123",
      "ticketId": "ticket_789",
      "category": "billing",
      "priority": "high",
      "assignedTo": "agent_45",
      "created": "2025-01-06T10:30:00Z"
    }
  }
}

Implement Thread Cleanup

Periodically delete old or resolved threads:

def cleanup_old_threads():
    # Get threads older than 90 days (get_threads is your own helper,
    # e.g. built on the list threads API)
    old_threads = get_threads(older_than=90)

    for thread_id in old_threads:
        requests.delete(
            f'https://api.aitronos.com/v1/threads/{thread_id}',
            headers={'Authorization': f'Bearer {api_key}'}
        )

Handle Thread Not Found

async function sendMessage(threadId, message) {
  try {
    const response = await fetch('/v1/model/response', {
      method: 'POST',
      body: JSON.stringify({
        model: 'gpt-4.1',
        thread: threadId,
        inputs: [{ role: 'user', texts: [{ text: message }] }]
      })
    });
    
    if (response.status === 404) {
      // Thread was deleted or expired - create new one
      return await sendMessage(createNewThreadId(), message);
    }
    
    return await response.json();
  } catch (error) {
    console.error('Thread error:', error);
  }
}

Threads vs Manual History

With Threads (Automatic)

// Simple - thread handles history
await sendMessage(threadId, 'Hello');
await sendMessage(threadId, 'How are you?'); // Context automatic

Without Threads (Manual)

// Complex - must track history yourself
let history = [];

async function sendMessage(text) {
  const userMsg = { role: 'user', texts: [{ text }] };
  history.push(userMsg);
  
  const response = await fetch('/v1/model/response', {
    method: 'POST',
    body: JSON.stringify({
      model: 'gpt-4.1',
      inputs: history // Must send full history each time
    })
  });
  
  const result = await response.json();
  history.push(result.output[0]); // Must save assistant response
  return result;
}