# Background Mode

> 🔨 **In Development**: This section is still being developed and may change.

Run model responses asynchronously in the background, allowing your application to continue without waiting for completion.

## Overview

Background mode enables you to submit a model response request and receive an immediate response with a tracking ID, while the actual processing happens asynchronously. This is ideal for:

- Long-running generations
- Batch processing workflows
- Non-blocking UI experiences
- Queue-based architectures

## How It Works

1. **Submit Request**: Send a POST request with `background: true`
2. **Receive ID**: Get an immediate response with a unique response ID
3. **Check Status**: Poll the status endpoint to monitor progress
4. **Retrieve Result**: Fetch the completed response when ready

## Basic Example

```bash
# Submit background request
curl https://api.freddy.aitronos.com/v1/model/response \
  -H "Authorization: Bearer $FREDDY_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4.1",
    "background": true,
    "inputs": [
      {
        "role": "user",
        "texts": [{"text": "Write a detailed article about AI"}]
      }
    ]
  }'
```

**Response:**

```json
{
  "id": "resp_67ccd2bed1ec8190",
  "status": "queued",
  "created_at": 1741476542
}
```

## Checking Status

Poll the status endpoint to monitor progress:

```bash
curl https://api.freddy.aitronos.com/v1/model/response/resp_67ccd2bed1ec8190 \
  -H "Authorization: Bearer $FREDDY_API_KEY"
```

**Response (In Progress):**

```json
{
  "id": "resp_67ccd2bed1ec8190",
  "status": "in_progress",
  "created_at": 1741476542
}
```

**Response (Completed):**

```json
{
  "id": "resp_67ccd2bed1ec8190",
  "status": "completed",
  "created_at": 1741476542,
  "completed_at": 1741476550,
  "output": [
    {
      "type": "message",
      "role": "assistant",
      "content": [
        {
          "type": "output_text",
          "text": "Here is a detailed article about AI..."
        }
      ]
    }
  ],
  "usage": {
    "input_tokens": 12,
    "output_tokens": 450,
    "total_tokens": 462
  }
}
```

## Status Values

| Status | Description |
| --- | --- |
| `queued` | Request received and queued for processing |
| `in_progress` | Currently being processed |
| `completed` | Successfully completed |
| `failed` | Processing failed (see error details) |
| `cancelled` | Request was cancelled |

## Best Practices

### Polling Interval

Don't poll too frequently. Recommended intervals:

- **Short tasks** (under 30s expected): poll every 2-3 seconds
- **Medium tasks** (30s to 5min expected): poll every 5-10 seconds
- **Long tasks** (over 5min expected): poll every 30-60 seconds

### Timeout Handling

Set a reasonable timeout based on your use case (the `checkStatus` helper used here is sketched under Helper Functions below):

```javascript
async function waitForCompletion(responseId, maxWaitTime = 300000) { // default: 5 minutes
  const startTime = Date.now();

  while (Date.now() - startTime < maxWaitTime) {
    const status = await checkStatus(responseId);

    if (status.status === 'completed') {
      return status;
    }
    // Treat both terminal failure states as errors so the loop exits early
    if (status.status === 'failed' || status.status === 'cancelled') {
      throw new Error(`Response ${status.status}`);
    }

    await new Promise(resolve => setTimeout(resolve, 5000)); // Wait 5s between polls
  }

  throw new Error('Timeout waiting for response');
}
```

### Webhooks (Coming Soon)

Instead of polling, you will be able to register a webhook URL to receive a notification when processing completes.
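### Helper Functions

The examples on this page call `submitBackgroundRequest` and `checkStatus` without defining them. A minimal sketch of what they might look like, assuming a Node 18+ runtime with a global `fetch`, the endpoints shown above, and an API key supplied via the environment; the model, input shape, and error handling are illustrative placeholders to adapt:

```javascript
const BASE_URL = 'https://api.freddy.aitronos.com/v1/model/response';
const apiKey = process.env.FREDDY_API_KEY; // assumption: key provided via environment

// Submit a request with background: true and return the immediate
// acknowledgment ({ id, status, created_at }).
async function submitBackgroundRequest(prompt) {
  const response = await fetch(BASE_URL, {
    method: 'POST',
    headers: {
      'Authorization': `Bearer ${apiKey}`,
      'Content-Type': 'application/json'
    },
    body: JSON.stringify({
      model: 'gpt-4.1',
      background: true,
      inputs: [{ role: 'user', texts: [{ text: prompt }] }]
    })
  });
  if (!response.ok) {
    throw new Error(`Submit failed with HTTP ${response.status}`);
  }
  return response.json();
}

// Fetch the current status object for a background response by ID.
async function checkStatus(responseId) {
  const response = await fetch(`${BASE_URL}/${responseId}`, {
    headers: { 'Authorization': `Bearer ${apiKey}` }
  });
  if (!response.ok) {
    throw new Error(`Status check failed with HTTP ${response.status}`);
  }
  return response.json();
}
```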
## Use Cases

### Batch Processing

Process multiple requests in parallel:

```javascript
const prompts = ['prompt1', 'prompt2', 'prompt3'];
const responseIds = [];

// Submit all requests
for (const prompt of prompts) {
  const response = await fetch('https://api.freddy.aitronos.com/v1/model/response', {
    method: 'POST',
    headers: {
      'Authorization': `Bearer ${apiKey}`,
      'Content-Type': 'application/json'
    },
    body: JSON.stringify({
      model: 'gpt-4.1',
      background: true,
      inputs: [{ role: 'user', texts: [{ text: prompt }] }]
    })
  });

  const data = await response.json();
  responseIds.push(data.id);
}

// Wait for all to complete
const results = await Promise.all(
  responseIds.map(id => waitForCompletion(id))
);
```

### Non-Blocking UI

Keep your UI responsive while processing:

```javascript
async function generateResponse(prompt) {
  // Submit background request
  const response = await submitBackgroundRequest(prompt);

  // Show loading state with ID
  showLoadingMessage(`Processing... (ID: ${response.id})`);

  // Poll for completion
  const result = await waitForCompletion(response.id);

  // Update UI with result
  displayResult(result);
}
```

### Queue-Based Architecture

Integrate with your job queue:

```python
from celery import Celery
import requests

app = Celery('tasks')

@app.task
def process_ai_response(prompt):
    # Submit background request
    response = requests.post(
        'https://api.freddy.aitronos.com/v1/model/response',
        headers={'Authorization': f'Bearer {api_key}'},
        json={
            'model': 'gpt-4.1',
            'background': True,
            'inputs': [{'role': 'user', 'texts': [{'text': prompt}]}]
        }
    )
    response_id = response.json()['id']

    # Store the ID so a follow-up task can check its status
    return response_id
```

## Limitations

- **Maximum processing time**: 30 minutes
- **Result retention**: Results are stored for 24 hours
- **Rate limits**: Same as synchronous requests

## Related Resources

- [Create Model Response](/docs/api-reference/responses/create)
- [Streaming Mode](/docs/documentation/running-methods/streaming-mode)
- [Rate Limiting](/docs/documentation/rate-limiting)