Freddy uses synapses and neurons as its units for measuring AI model computation. Understanding these helps you predict costs and optimize usage.
Neurons measure input processing — the computational work required to read and understand everything you send to the model:
- Input neurons — Your text messages and prompts
- Context neurons — Thread history included in the request
- System neurons — Instructions and system prompts from your assistant
Every token in your request context consumes neurons.
Synapses measure output generation — the computational work required to produce the model's response:
- Output synapses — The visible text, structured data, or tool calls in the response
- Reasoning synapses — Internal thinking steps for reasoning models (counted even when not shown)
- Tool execution synapses — Processing during function calls
Every token the model generates consumes synapses.
Input and output computation are fundamentally different workloads. Separating them gives you:
- Transparency — See exactly where your costs come from
- Control — Limit output independently from input with `max_output_synapses`
- Fairness — Pay for the actual computation performed, not a blended average
Tokens (used internally by language models) and synapses/neurons have a direct relationship:
- 1 synapse ≈ 10 output tokens
- 1 neuron ≈ 10 input tokens
These are rounded estimates. The exact conversion may vary by model.
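Using the rounded ratios above, you can estimate unit counts from token counts. A minimal sketch — the ceiling rounding is an assumption, and Freddy's exact accounting may differ by model:

```python
import math

# Approximate ratio from the docs: 1 synapse ≈ 10 output tokens,
# 1 neuron ≈ 10 input tokens. Exact conversion varies by model.
TOKENS_PER_UNIT = 10

def tokens_to_neurons(input_tokens: int) -> int:
    """Estimate neurons consumed for a given input token count."""
    return math.ceil(input_tokens / TOKENS_PER_UNIT)

def tokens_to_synapses(output_tokens: int) -> int:
    """Estimate synapses consumed for a given output token count."""
    return math.ceil(output_tokens / TOKENS_PER_UNIT)

print(tokens_to_neurons(245))   # → 25
print(tokens_to_synapses(87))   # → 9
```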
| Metric | Rate |
|---|---|
| Neurons | 8.00 CHF per 1M neurons |
| Synapses | Varies by model |
See Pricing for full model-by-model rates.
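The fixed neuron rate makes the input side of a bill easy to estimate. A back-of-the-envelope sketch — synapse costs vary by model and are omitted here, and the 2.5M-neuron figure is purely illustrative:

```python
# Input-side cost estimate using the 8.00 CHF per 1M neurons rate
# from the table above. Synapse rates vary by model, so only the
# neuron side is computed; see the Pricing page for actual rates.
NEURON_RATE_CHF_PER_MILLION = 8.00

def input_cost_chf(neurons: int) -> float:
    """Estimate input-processing cost in CHF for a neuron count."""
    return neurons * NEURON_RATE_CHF_PER_MILLION / 1_000_000

# e.g. 2.5M neurons of input processing:
print(f"{input_cost_chf(2_500_000):.2f} CHF")  # → 20.00 CHF
```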
Every response includes a usage object with token counts:
```json
{
  "usage": {
    "input_tokens": 245,
    "output_tokens": 87,
    "total_tokens": 332
  }
}
```

Your usage dashboard in Freddy shows the corresponding synapse and neuron counts.
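To reconcile the API's token counts with the dashboard's unit counts, you can derive approximate values from the usage object. A sketch assuming the ~10-tokens-per-unit estimate; the field names match the example response above:

```python
import json
import math

# Example response body with the usage object shown above.
response_body = """
{
  "usage": {
    "input_tokens": 245,
    "output_tokens": 87,
    "total_tokens": 332
  }
}
"""

usage = json.loads(response_body)["usage"]

# Derive approximate unit counts (1 unit ≈ 10 tokens, rounded up).
neurons = math.ceil(usage["input_tokens"] / 10)
synapses = math.ceil(usage["output_tokens"] / 10)

print(neurons, synapses)  # → 25 9
```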
Use max_output_synapses to cap response length and cost:
```json
{
  "organization_id": "org_your_org_id",
  "model": "gpt-4o",
  "max_output_synapses": 512,
  "inputs": [{"role": "user", "content": "Explain quantum computing briefly."}]
}
```

Common values:
| Value | Use Case |
|---|---|
| 128 | One-sentence answers |
| 512 | Short paragraphs |
| 2048 | Standard responses (default) |
| 4096 | Detailed explanations |
| 8192 | Long documents or code |
If the model reaches the limit mid-response, the output is truncated.
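One way to spot truncation client-side is to check whether the response used essentially its whole synapse budget. A heuristic sketch — `likely_truncated` is not an official API field or SDK function, and it assumes the ~10-tokens-per-synapse estimate:

```python
import math

def likely_truncated(usage: dict, max_output_synapses: int) -> bool:
    """Heuristic: if the response consumed (approximately) its full
    synapse budget, the output was probably cut off mid-response.
    Assumes 1 synapse ≈ 10 output tokens, rounded up."""
    synapses_used = math.ceil(usage["output_tokens"] / 10)
    return synapses_used >= max_output_synapses

print(likely_truncated({"output_tokens": 5120}, 512))  # → True
print(likely_truncated({"output_tokens": 87}, 512))    # → False
```

A response flagged this way can be retried with a larger `max_output_synapses`, or the prompt can be split into smaller requests.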
Track consumption via:
- Freddy → Organization → Usage dashboard
- Analytics API — Programmatic access, see the API Reference
Related:
- Pricing — Rates and billing
- Usage Limits Guide — Spending limits and alerts
- Text Generation — Making requests
- Rate Limiting — Request volume limits