Threads provide stateful, persistent conversations that automatically maintain context across multiple model responses.
A thread is a container for a multi-turn conversation between users and AI assistants. Instead of manually managing conversation history, threads automatically:
- Store messages: All inputs and outputs are saved
- Maintain context: Previous messages are automatically included in new requests
- Persist state: Conversations survive across sessions
- Enable continuation: Resume conversations anytime with the thread ID
When you create a model response with a thread:
1. Thread lookup: Existing messages in the thread are retrieved
2. Context injection: Thread messages are prepended to your new inputs
3. Processing: The model sees the full conversation history
4. Auto-save: Your new inputs and outputs are added to the thread
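The four steps above can be sketched with an in-memory dict standing in for the server-side thread store. Everything here is illustrative, not part of the Aitronos API:

```python
threads = {}  # thread_id -> saved messages

def build_model_input(thread_id, new_inputs):
    """Thread lookup + context injection: history is prepended to new inputs."""
    history = threads.setdefault(thread_id, [])  # created if it doesn't exist
    return history + new_inputs

def save_turn(thread_id, new_inputs, outputs):
    """Auto-save: both the new inputs and the model outputs join the thread."""
    threads[thread_id].extend(new_inputs)
    threads[thread_id].extend(outputs)

# Turn 1
first = [{"role": "user", "text": "Hello"}]
build_model_input("thread_abc123", first)
save_turn("thread_abc123", first,
          [{"role": "assistant", "text": "Hi! How can I help?"}])

# Turn 2: the model sees the full history plus the new message
full = build_model_input("thread_abc123",
                         [{"role": "user", "text": "What's the weather?"}])
```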
```text
Thread: thread_abc123
├─ Message 1: "Hello" (user)
├─ Message 2: "Hi! How can I help?" (assistant)
├─ Message 3: "What's the weather?" (user) ← New request
└─ Message 4: "I'll check that for you..." (assistant) ← Auto-saved
```

Pass a thread ID string. If it doesn't exist, it will be created:
```bash
curl https://api.aitronos.com/v1/model/response \
  -H "X-API-Key: $FREDDY_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4.1",
    "thread": "thread_user123_session1",
    "inputs": [
      {
        "role": "user",
        "texts": [{"text": "Hello, I need help with Python"}]
      }
    ]
  }'
```

Pass an object to create a thread with metadata:
```json
{
  "model": "gpt-4.1",
  "thread": {
    "id": "thread_user123_session1",
    "metadata": {
      "userId": "user_123",
      "sessionId": "sess_456",
      "topic": "python_help"
    }
  },
  "inputs": [
    {
      "role": "user",
      "texts": [{"text": "Hello, I need help with Python"}]
    }
  ]
}
```

Create an empty thread first, then use it:
```bash
# Create thread
curl https://api.aitronos.com/v1/threads \
  -H "X-API-Key: $FREDDY_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "metadata": {
      "userId": "user_123",
      "purpose": "support_chat"
    }
  }'

# Use thread in responses
curl https://api.aitronos.com/v1/model/response \
  -H "X-API-Key: $FREDDY_API_KEY" \
  -d '{
    "model": "gpt-4.1",
    "thread": "thread_67cc8901",
    "inputs": [...]
  }'
```

To continue a conversation, simply reference the same thread ID:
```javascript
// First message
const response1 = await fetch('/v1/model/response', {
  method: 'POST',
  headers: { 'X-API-Key': apiKey },
  body: JSON.stringify({
    model: 'gpt-4.1',
    thread: 'thread_chat_001',
    inputs: [{ role: 'user', texts: [{ text: 'What is Python?' }] }]
  })
});

// Continue conversation (history is automatic)
const response2 = await fetch('/v1/model/response', {
  method: 'POST',
  headers: { 'X-API-Key': apiKey },
  body: JSON.stringify({
    model: 'gpt-4.1',
    thread: 'thread_chat_001', // Same thread ID
    inputs: [{ role: 'user', texts: [{ text: 'Can you show an example?' }] }]
  })
});

// Model automatically sees the previous Q&A about Python
```

Threads have a lifecycle state that controls how they can be used:
open (default)
- Accepts new messages and responses
- Thread can be modified and extended
- Use for active conversations

locked
- Read-only mode: no new messages allowed
- Existing messages remain accessible
- Useful for preserving completed conversations
- Can be unlocked back to open

archived
- Preserved but inactive
- Cannot add messages (permanent)
- Still accessible for reading
- Use for long-term storage of resolved conversations
```bash
# Lock a thread to prevent modifications
curl https://api.aitronos.com/v1/model/response \
  -H "X-API-Key: $FREDDY_API_KEY" \
  -d '{
    "model": "gpt-4.1",
    "thread": "thread_abc123",
    "threadState": "locked",
    "inputs": [...]
  }'

# Archive a completed conversation
curl https://api.aitronos.com/v1/threads/thread_abc123 \
  -X PATCH \
  -H "X-API-Key: $FREDDY_API_KEY" \
  -d '{"state": "archived"}'
```

```text
open ──────────────────> locked
  ↑                         ↓
  └──────── unlock ─────────┘

open ──────────────────> archived (permanent)
locked ────────────────> archived (permanent)
```

Lock threads when:
- Conversation is complete but may need reference
- Compliance requires immutable records
- Support ticket is resolved but not ready for archival
- Quality review is in progress
Archive threads when:
- Conversation will never be continued
- Long-term storage is needed
- Freeing up active thread capacity
- Compliance retention period begins
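The transition rules above can be encoded as a small guard to run before issuing a state-change request. This is an illustrative sketch, not part of the API; the table mirrors the diagram (unlocking is allowed, archiving is permanent):

```python
# Valid lifecycle transitions. Archiving is terminal:
# nothing transitions out of "archived".
VALID_TRANSITIONS = {
    ("open", "locked"),
    ("locked", "open"),      # unlock
    ("open", "archived"),    # permanent
    ("locked", "archived"),  # permanent
}

def can_transition(current, target):
    """Return True if a thread in `current` state may move to `target`."""
    return (current, target) in VALID_TRANSITIONS
```

Calling this before the PATCH keeps invalid requests (such as trying to reopen an archived thread) from ever reaching the API.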
Control whether threads appear in the user's thread list using the store parameter when creating model responses.
By default, all threads are visible in the list threads API (store=true):
```bash
curl https://api.aitronos.com/v1/model/response \
  -H "X-API-Key: $FREDDY_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "organization_id": "org_abc123",
    "model": "ftg-3.0",
    "store": true,
    "inputs": [
      {
        "role": "user",
        "texts": [{"text": "Hello"}]
      }
    ]
  }'
```

Create threads that don't appear in the list threads API by setting store=false:
```bash
curl https://api.aitronos.com/v1/model/response \
  -H "X-API-Key: $FREDDY_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "organization_id": "org_abc123",
    "model": "ftg-3.0-speed",
    "store": false,
    "inputs": [
      {
        "role": "user",
        "texts": [{"text": "Generate a title for this conversation"}]
      }
    ]
  }'
```

Hidden threads are:
- ✅ Created and stored in the database
- ✅ Accessible by direct thread ID lookup
- ✅ Fully functional for all thread operations
- ❌ Not visible in the list threads API (by default)
- ❌ Not shown in the user interface
Internal Operations

```python
import requests

# Generate a thread name without creating a visible thread
response = requests.post(
    'https://api.aitronos.com/v1/model/response',
    headers={'Authorization': f'Bearer {api_key}'},
    json={
        'organization_id': 'org_abc123',
        'model': 'ftg-3.0-speed',
        'store': False,  # Hidden thread
        'inputs': [
            {'role': 'user', 'texts': [{'text': 'Generate a title'}]}
        ]
    }
)
```

Background Tasks
```javascript
// Process data without cluttering thread list
const response = await fetch('/v1/model/response', {
  method: 'POST',
  body: JSON.stringify({
    organization_id: 'org_abc123',
    model: 'ftg-3.0',
    store: false, // Hidden thread
    inputs: [
      { role: 'user', texts: [{ text: 'Analyze this dataset' }] }
    ]
  })
});
```

Temporary Threads
```python
import requests

# Create ephemeral thread for a one-time operation
def generate_summary(content):
    response = requests.post(
        'https://api.aitronos.com/v1/model/response',
        json={
            'organization_id': 'org_abc123',
            'model': 'ftg-3.0',
            'store': False,  # Won't appear in user's thread list
            'inputs': [
                {'role': 'user', 'texts': [{'text': f'Summarize: {content}'}]}
            ]
        }
    )
    return response.json()
```

Hidden threads can still be accessed directly by ID:
```bash
# Get hidden thread by ID
curl https://api.aitronos.com/v1/threads/thread_abc123 \
  -H "X-API-Key: $FREDDY_API_KEY"

# List hidden threads explicitly (quote the URL so the shell
# doesn't interpret the query string)
curl "https://api.aitronos.com/v1/threads?visible_in_ui=false" \
  -H "X-API-Key: $FREDDY_API_KEY"
```

Use store=true (default) for:
- User-facing conversations
- Customer support chats
- Multi-turn dialogues
- Conversations that need to be resumed
Use store=false for:
- Internal operations (name generation, summaries)
- Background processing
- Temporary or ephemeral threads
- System-level tasks
- Automated workflows
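One way to keep the rule of thumb above out of every call site is a small payload builder that derives the store flag from the purpose of the request. This is a sketch; `build_response_payload` is not an SDK function:

```python
def build_response_payload(model, text, *, user_facing=True, thread=None):
    """Build a model-response payload; internal work defaults to a hidden thread."""
    payload = {
        "model": model,
        "store": user_facing,  # store=false keeps the thread out of the list API
        "inputs": [{"role": "user", "texts": [{"text": text}]}],
    }
    if thread is not None:
        payload["thread"] = thread
    return payload

# User-facing chat: visible thread
chat = build_response_payload("gpt-4.1", "Hello", thread="thread_chat_001")

# Internal title generation: hidden, thread-less request
title = build_response_payload("ftg-3.0-speed", "Generate a title",
                               user_facing=False)
```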
For scenarios where you don't want persistent threads, use previousResponseId to chain responses:
```javascript
// First message
const response1 = await fetch('/v1/model/response', {
  method: 'POST',
  body: JSON.stringify({
    model: 'gpt-4.1',
    inputs: [{ role: 'user', texts: [{ text: 'Hello' }] }]
  })
});
const data1 = await response1.json();

// Continue without thread - just reference previous response
const response2 = await fetch('/v1/model/response', {
  method: 'POST',
  body: JSON.stringify({
    model: 'gpt-4.1',
    previousResponseId: data1.id, // Link to previous response
    inputs: [{ role: 'user', texts: [{ text: 'Tell me more' }] }]
  })
});

// Model sees context from response1 without creating a persistent thread
```

When to use previousResponseId:
- Short, ephemeral conversations
- Zero data retention requirements
- Single-session workflows
- Testing without polluting thread storage
Limitations:
- Cannot reference responses older than 24 hours
- No persistent history (only the previous response)
- Cannot be combined with the thread parameter
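Because of the 24-hour window, chaining code may want to check the age of the prior response before referencing it. The sketch below shows one way to do that; the helper name and the idea of tracking the creation time client-side are assumptions, not API features:

```python
from datetime import datetime, timedelta, timezone

MAX_REFERENCE_AGE = timedelta(hours=24)  # per the limitation above

def chain_kwargs(previous_id, previous_created_at, now=None):
    """Return a previousResponseId field if the prior response is still
    referencable; otherwise an empty dict so the caller starts fresh."""
    now = now or datetime.now(timezone.utc)
    if now - previous_created_at > MAX_REFERENCE_AGE:
        return {}
    return {"previousResponseId": previous_id}

now = datetime.now(timezone.utc)
fresh = chain_kwargs("resp_123", now - timedelta(hours=1), now=now)
stale = chain_kwargs("resp_456", now - timedelta(hours=30), now=now)
```

Merging the returned dict into the request payload keeps the fallback (starting a fresh conversation) in one place.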
```bash
# Retrieve a thread's messages
curl https://api.aitronos.com/v1/threads/thread_67cc8901/messages \
  -H "X-API-Key: $FREDDY_API_KEY"

# Update thread metadata
curl https://api.aitronos.com/v1/threads/thread_67cc8901 \
  -X PATCH \
  -H "X-API-Key: $FREDDY_API_KEY" \
  -d '{
    "metadata": {
      "status": "resolved",
      "rating": 5
    }
  }'

# Delete a thread
curl https://api.aitronos.com/v1/threads/thread_67cc8901 \
  -X DELETE \
  -H "X-API-Key: $FREDDY_API_KEY"
```

```python
import requests

def handle_support_message(user_id, message):
    thread_id = f"support_{user_id}"
    response = requests.post(
        'https://api.aitronos.com/v1/model/response',
        headers={'Authorization': f'Bearer {api_key}'},
        json={
            'model': 'gpt-4.1',
            'thread': {
                'id': thread_id,
                'metadata': {
                    'userId': user_id,
                    'department': 'support',
                    'priority': 'normal'
                }
            },
            'inputs': [
                {'role': 'user', 'texts': [{'text': message}]}
            ]
        }
    )
    return response.json()
```

```javascript
class ConversationManager {
  constructor(userId) {
    this.threadId = `user_${userId}_${Date.now()}`;
  }

  async sendMessage(text) {
    const response = await fetch('/v1/model/response', {
      method: 'POST',
      body: JSON.stringify({
        model: 'gpt-4.1',
        thread: this.threadId,
        inputs: [{ role: 'user', texts: [{ text }] }]
      })
    });
    return await response.json();
  }

  getThreadId() {
    return this.threadId;
  }
}

// Usage
const chat = new ConversationManager('user_123');
await chat.sendMessage('Hello');
await chat.sendMessage('Tell me more'); // Remembers "Hello" context
```

```python
import requests

def onboarding_workflow(user_id, step, user_input):
    thread_id = f"onboarding_{user_id}"
    # System prompt sets the context for the thread
    system_message = {
        'role': 'system',
        'texts': [{'text': 'You are helping a user through an onboarding process. Current step: ' + step}]
    }
    response = requests.post(
        'https://api.aitronos.com/v1/model/response',
        json={
            'model': 'gpt-4.1',
            'thread': thread_id,
            'inputs': [
                system_message,
                {'role': 'user', 'texts': [{'text': user_input}]}
            ]
        }
    )
    return response.json()
```

- Maximum messages per thread: 10,000
- Message retention: 90 days of inactivity
- Max thread age: 1 year
- Metadata size: 16KB per thread
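The per-thread message cap above matters for long-running conversations. A simple guard (illustrative only) can decide when to roll over to a fresh thread before a turn would exceed the limit:

```python
MAX_MESSAGES_PER_THREAD = 10_000  # from the limits above

def needs_rollover(message_count, messages_per_turn=2):
    """True when the next turn (one user input + one assistant output by
    default) would push the thread past the per-thread message limit."""
    return message_count + messages_per_turn > MAX_MESSAGES_PER_THREAD
```

When this returns True, the application can create a new thread (optionally seeding it with a summary of the old one) and continue there.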
```javascript
// ✅ Good - descriptive and unique
const threadId = `support_${userId}_${ticketId}`;
const threadId = `chat_${sessionId}_${timestamp}`;

// ❌ Bad - too generic or not unique
const threadId = 'thread1';
const threadId = userId; // May conflict across contexts
```

```json
{
  "thread": {
    "id": "support_user123_ticket789",
    "metadata": {
      "userId": "user_123",
      "ticketId": "ticket_789",
      "category": "billing",
      "priority": "high",
      "assignedTo": "agent_45",
      "created": "2025-01-06T10:30:00Z"
    }
  }
}
```

Periodically delete old or resolved threads:
```python
import requests

def cleanup_old_threads():
    # Get threads older than 90 days (get_threads stands in for your own
    # listing logic, e.g. a filtered call to the list threads API)
    old_threads = get_threads(older_than=90)
    for thread_id in old_threads:
        requests.delete(f'https://api.aitronos.com/v1/threads/{thread_id}')
```

```javascript
async function sendMessage(threadId, message) {
  try {
    const response = await fetch('/v1/model/response', {
      method: 'POST',
      body: JSON.stringify({
        model: 'gpt-4.1',
        thread: threadId,
        inputs: [{ role: 'user', texts: [{ text: message }] }]
      })
    });
    if (response.status === 404) {
      // Thread was deleted or expired - create new one
      return await sendMessage(createNewThreadId(), message);
    }
    return await response.json();
  } catch (error) {
    console.error('Thread error:', error);
  }
}
```

```javascript
// Simple - thread handles history
await sendMessage(threadId, 'Hello');
await sendMessage(threadId, 'How are you?'); // Context automatic
```

```javascript
// Complex - must track history yourself
let history = [];

async function sendMessage(text) {
  const userMsg = { role: 'user', texts: [{ text }] };
  history.push(userMsg);

  const response = await fetch('/v1/model/response', {
    method: 'POST',
    body: JSON.stringify({
      model: 'gpt-4.1',
      inputs: history // Must send full history each time
    })
  });

  const result = await response.json();
  history.push(result.output[0]); // Must save assistant response
  return result;
}
```