
🔨 In Development — This section is still being developed and may change.
Threads provide stateful, persistent conversations that automatically maintain context across multiple model responses.

Overview

A thread is a container for a multi-turn conversation between users and AI assistants. Instead of manually managing conversation history, threads automatically:

  • Store messages: All inputs and outputs are saved
  • Maintain context: Previous messages are automatically included in new requests
  • Persist state: Conversations survive across sessions
  • Enable continuation: Resume conversations anytime with the thread ID
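The behavior in these bullets can be pictured as a simple data model. This is an illustrative sketch only; the field names are assumptions, not the API's actual storage schema:

```python
from dataclasses import dataclass, field

@dataclass
class Message:
    role: str  # "user" or "assistant"
    text: str

@dataclass
class Thread:
    id: str
    state: str = "open"  # open | locked | archived (see Thread States)
    messages: list = field(default_factory=list)  # full history, auto-saved
    metadata: dict = field(default_factory=dict)

# Each turn appends to the same thread, preserving context across sessions.
t = Thread(id="thread_abc123")
t.messages.append(Message("user", "Hello"))
t.messages.append(Message("assistant", "Hi! How can I help?"))
```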

How Threads Work

When you create a model response with a thread:

  1. Thread lookup: Existing messages in the thread are retrieved
  2. Context injection: Thread messages are prepended to your new inputs
  3. Processing: The model sees full conversation history
  4. Auto-save: Your new inputs and outputs are added to the thread
Thread: thread_abc123
├─ Message 1: "Hello" (user)
├─ Message 2: "Hi! How can I help?" (assistant)
├─ Message 3: "What's the weather?" (user) ← New request
└─ Message 4: "I'll check that for you..." (assistant) ← Auto-saved
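In code, step 2 (context injection) amounts to prepending the stored thread messages to the new inputs. A minimal sketch, assuming messages use the same shape as the request bodies shown below:

```python
def build_model_input(thread_messages, new_inputs):
    # Step 2: prepend stored thread history so the model sees the full conversation.
    return list(thread_messages) + list(new_inputs)

history = [
    {"role": "user", "texts": [{"text": "Hello"}]},
    {"role": "assistant", "texts": [{"text": "Hi! How can I help?"}]},
]
new_request = [{"role": "user", "texts": [{"text": "What's the weather?"}]}]

full_input = build_model_input(history, new_request)  # 3 messages, oldest first
```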

Creating a Thread

Option 1: Auto-Create with String ID

Pass a thread ID string. If it doesn't exist, it will be created:

curl https://api.freddy.aitronos.com/v1/model/response \
  -H "Authorization: Bearer $FREDDY_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4.1",
    "thread": "thread_user123_session1",
    "inputs": [
      {
        "role": "user",
        "texts": [{"text": "Hello, I need help with Python"}]
      }
    ]
  }'

Option 2: Create with Metadata

Pass an object to create a thread with metadata:

{
  "model": "gpt-4.1",
  "thread": {
    "id": "thread_user123_session1",
    "metadata": {
      "userId": "user_123",
      "sessionId": "sess_456",
      "topic": "python_help"
    }
  },
  "inputs": [
    {
      "role": "user",
      "texts": [{"text": "Hello, I need help with Python"}]
    }
  ]
}

Option 3: Thread-First Creation

Create an empty thread first, then use it:

# Create thread
curl https://api.freddy.aitronos.com/v1/threads \
  -H "Authorization: Bearer $FREDDY_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "metadata": {
      "userId": "user_123",
      "purpose": "support_chat"
    }
  }'

# Use thread in responses
curl https://api.freddy.aitronos.com/v1/model/response \
  -H "Authorization: Bearer $FREDDY_API_KEY" \
  -d '{
    "model": "gpt-4.1",
    "thread": "thread_67cc8901",
    "inputs": [...]
  }'

Continuing a Conversation

Simply reference the same thread ID:

// First message
const response1 = await fetch('/v1/model/response', {
  method: 'POST',
  headers: { 'Authorization': `Bearer ${apiKey}`, 'Content-Type': 'application/json' },
  body: JSON.stringify({
    model: 'gpt-4.1',
    thread: 'thread_chat_001',
    inputs: [{ role: 'user', texts: [{ text: 'What is Python?' }] }]
  })
});

// Continue conversation (history is automatic)
const response2 = await fetch('/v1/model/response', {
  method: 'POST',
  headers: { 'Authorization': `Bearer ${apiKey}`, 'Content-Type': 'application/json' },
  body: JSON.stringify({
    model: 'gpt-4.1',
    thread: 'thread_chat_001', // Same thread ID
    inputs: [{ role: 'user', texts: [{ text: 'Can you show an example?' }] }]
  })
});
// Model automatically sees the previous Q&A about Python

Thread States

Threads have a lifecycle state that controls how they can be used:

Available States

open (default)

  • Accepts new messages and responses
  • Thread can be modified and extended
  • Use for active conversations

locked

  • Read-only mode - no new messages allowed
  • Existing messages remain accessible
  • Useful for preserving completed conversations
  • Can be unlocked back to open

archived

  • Preserved but inactive
  • Cannot add messages (permanent)
  • Still accessible for reading
  • Use for long-term storage of resolved conversations

Setting Thread State

# Lock a thread to prevent modifications
curl https://api.freddy.aitronos.com/v1/model/response \
  -H "Authorization: Bearer $FREDDY_API_KEY" \
  -d '{
    "model": "gpt-4.1",
    "thread": "thread_abc123",
    "threadState": "locked",
    "inputs": [...]
  }'

# Archive a completed conversation
curl https://api.freddy.aitronos.com/v1/threads/thread_abc123 \
  -X PATCH \
  -H "Authorization: Bearer $FREDDY_API_KEY" \
  -d '{"state": "archived"}'

State Transitions

open ──────────────────> locked
 ↑                          ↓
 └──────── unlock ──────────┘

open ──────────────────> archived (permanent)
locked ────────────────> archived (permanent)
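The transition rules in the diagram can be encoded in a small lookup table, for example to validate a state change client-side before issuing a PATCH. This is a sketch; the API enforces these rules server-side:

```python
# Allowed transitions from the diagram above; "archived" is terminal.
ALLOWED_TRANSITIONS = {
    "open": {"locked", "archived"},
    "locked": {"open", "archived"},  # unlock back to open, or archive permanently
    "archived": set(),               # permanent: no transitions out
}

def can_transition(current: str, target: str) -> bool:
    return target in ALLOWED_TRANSITIONS.get(current, set())
```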

Use Cases

Lock threads when:

  • Conversation is complete but may need reference
  • Compliance requires immutable records
  • Support ticket is resolved but not ready for archival
  • Quality review is in progress

Archive threads when:

  • Conversation will never be continued
  • Long-term storage is needed
  • Freeing up active thread capacity
  • Compliance retention period begins

Stateless Conversations

For scenarios where you don't want persistent threads, use previousResponseId to chain responses:

// First message
const response1 = await fetch('/v1/model/response', {
  method: 'POST',
  headers: {
    'Authorization': `Bearer ${apiKey}`,
    'Content-Type': 'application/json'
  },
  body: JSON.stringify({
    model: 'gpt-4.1',
    inputs: [{ role: 'user', texts: [{ text: 'Hello' }] }]
  })
});

const data1 = await response1.json();

// Continue without thread - just reference previous response
const response2 = await fetch('/v1/model/response', {
  method: 'POST',
  headers: {
    'Authorization': `Bearer ${apiKey}`,
    'Content-Type': 'application/json'
  },
  body: JSON.stringify({
    model: 'gpt-4.1',
    previousResponseId: data1.id, // Link to previous response
    inputs: [{ role: 'user', texts: [{ text: 'Tell me more' }] }]
  })
});
// Model sees context from response1 without creating a persistent thread

When to use previousResponseId:

  • Short, ephemeral conversations
  • Zero data retention requirements
  • Single-session workflows
  • Testing without polluting thread storage

Limitations:

  • Cannot reference responses older than 24 hours
  • No persistent history (only previous response)
  • Cannot be combined with thread parameter
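Because thread and previousResponseId are mutually exclusive, it can help to validate request payloads before sending them. A hypothetical client-side guard:

```python
def validate_context_params(payload: dict) -> dict:
    # Enforce the limitation above: thread and previousResponseId cannot be combined.
    if "thread" in payload and "previousResponseId" in payload:
        raise ValueError("Use either 'thread' or 'previousResponseId', not both")
    return payload

validate_context_params({"model": "gpt-4.1", "thread": "thread_chat_001"})  # ok
```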

Thread Management

Retrieve Thread Messages

curl https://api.freddy.aitronos.com/v1/threads/thread_67cc8901/messages \
  -H "Authorization: Bearer $FREDDY_API_KEY"

Update Thread Metadata

curl https://api.freddy.aitronos.com/v1/threads/thread_67cc8901 \
  -X PATCH \
  -H "Authorization: Bearer $FREDDY_API_KEY" \
  -d '{
    "metadata": {
      "status": "resolved",
      "rating": 5
    }
  }'

Delete Thread

curl https://api.freddy.aitronos.com/v1/threads/thread_67cc8901 \
  -X DELETE \
  -H "Authorization: Bearer $FREDDY_API_KEY"

Use Cases

Customer Support Chat

import requests

def handle_support_message(user_id, message):
    thread_id = f"support_{user_id}"
    
    response = requests.post(
        'https://api.freddy.aitronos.com/v1/model/response',
        headers={'Authorization': f'Bearer {api_key}'},
        json={
            'model': 'gpt-4.1',
            'thread': {
                'id': thread_id,
                'metadata': {
                    'userId': user_id,
                    'department': 'support',
                    'priority': 'normal'
                }
            },
            'inputs': [
                {'role': 'user', 'texts': [{'text': message}]}
            ]
        }
    )
    return response.json()

Multi-Session Conversations

class ConversationManager {
  constructor(userId) {
    this.threadId = `user_${userId}_${Date.now()}`;
  }
  
  async sendMessage(text) {
    const response = await fetch('/v1/model/response', {
      method: 'POST',
      body: JSON.stringify({
        model: 'gpt-4.1',
        thread: this.threadId,
        inputs: [{ role: 'user', texts: [{ text }] }]
      })
    });
    return await response.json();
  }
  
  getThreadId() {
    return this.threadId;
  }
}

// Usage
const chat = new ConversationManager('user_123');
await chat.sendMessage('Hello');
await chat.sendMessage('Tell me more'); // Remembers "Hello" context

Guided Workflows

import requests

def onboarding_workflow(user_id, step, user_input):
    thread_id = f"onboarding_{user_id}"
    
    # System prompt sets the context for the thread
    system_message = {
        'role': 'system',
        'texts': [{'text': 'You are helping a user through an onboarding process. Current step: ' + step}]
    }
    
    response = requests.post(
        'https://api.freddy.aitronos.com/v1/model/response',
        headers={'Authorization': f'Bearer {api_key}'},
        json={
            'model': 'gpt-4.1',
            'thread': thread_id,
            'inputs': [
                system_message,
                {'role': 'user', 'texts': [{'text': user_input}]}
            ]
        }
    )
    
    return response.json()

Thread Limits

  • Maximum messages per thread: 10,000
  • Message retention: 90 days after last activity
  • Max thread age: 1 year
  • Metadata size: 16KB per thread
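The 16KB metadata limit can be checked locally before sending a request. A sketch, assuming the limit applies to the JSON-serialized metadata (the exact measurement the API uses is not specified here):

```python
import json

METADATA_LIMIT_BYTES = 16 * 1024  # 16KB per thread, per the limits above

def metadata_fits(metadata: dict) -> bool:
    # Measure the UTF-8 size of the serialized metadata against the limit.
    return len(json.dumps(metadata).encode("utf-8")) <= METADATA_LIMIT_BYTES
```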

Best Practices

Use Descriptive Thread IDs

// ✅ Good - descriptive and unique
const threadId = `support_${userId}_${ticketId}`;
const threadId = `chat_${sessionId}_${timestamp}`;

// ❌ Bad - too generic or not unique
const threadId = 'thread1';
const threadId = userId; // May conflict across contexts
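A small helper can make descriptive IDs the default rather than something each call site improvises. Illustrative only; the name and signature are not part of the API:

```python
def make_thread_id(scope: str, *parts: str) -> str:
    # Join a scope ("support", "chat", ...) with unique identifiers.
    return "_".join([scope, *parts])

make_thread_id("support", "user123", "ticket789")  # -> "support_user123_ticket789"
```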

Add Meaningful Metadata

{
  "thread": {
    "id": "support_user123_ticket789",
    "metadata": {
      "userId": "user_123",
      "ticketId": "ticket_789",
      "category": "billing",
      "priority": "high",
      "assignedTo": "agent_45",
      "created": "2025-01-06T10:30:00Z"
    }
  }
}

Implement Thread Cleanup

Periodically delete old or resolved threads:

def cleanup_old_threads():
    # Get threads inactive for more than 90 days (get_threads is your own lookup helper)
    old_threads = get_threads(older_than=90)

    for thread_id in old_threads:
        requests.delete(
            f'https://api.freddy.aitronos.com/v1/threads/{thread_id}',
            headers={'Authorization': f'Bearer {api_key}'}
        )

Handle Thread Not Found

async function sendMessage(threadId, message) {
  try {
    const response = await fetch('/v1/model/response', {
      method: 'POST',
      body: JSON.stringify({
        model: 'gpt-4.1',
        thread: threadId,
        inputs: [{ role: 'user', texts: [{ text: message }] }]
      })
    });
    
    if (response.status === 404) {
      // Thread was deleted or expired - create new one
      return await sendMessage(createNewThreadId(), message);
    }
    
    return await response.json();
  } catch (error) {
    console.error('Thread error:', error);
  }
}

Threads vs Manual History

With Threads (Automatic)

// Simple - thread handles history
await sendMessage(threadId, 'Hello');
await sendMessage(threadId, 'How are you?'); // Context automatic

Without Threads (Manual)

// Complex - must track history yourself
let history = [];

async function sendMessage(text) {
  const userMsg = { role: 'user', texts: [{ text }] };
  history.push(userMsg);
  
  const response = await fetch('/v1/model/response', {
    method: 'POST',
    body: JSON.stringify({
      model: 'gpt-4.1',
      inputs: history // Must send full history each time
    })
  });
  
  const result = await response.json();
  history.push(result.output[0]); // Must save assistant response
  return result;
}