## Endpoint

## Request Parameters
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| model | string | ✅ | — | Model ID (e.g. gpt-4o, claude-3-5-sonnet-20241022) |
| messages | array | ✅ | — | Conversation history with role + content |
| temperature | number | — | 1 | Creativity: 0 = deterministic, 2 = very creative |
| max_tokens | integer | — | model default | Max tokens to generate |
| stream | boolean | — | false | Stream partial tokens via SSE |
| top_p | number | — | 1 | Nucleus sampling threshold |
| frequency_penalty | number | — | 0 | Reduce repetition. Range: -2.0 to 2.0 |
| presence_penalty | number | — | 0 | Encourage new topics. Range: -2.0 to 2.0 |
| stop | string/array | — | — | Up to 4 stop sequences |
| n | integer | — | 1 | Number of completions to return |
| user | string | — | — | Your end-user ID for monitoring |
## Code Examples
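A minimal sketch in Python, assuming an OpenAI-compatible chat completions route and a placeholder base URL (substitute your gateway's real endpoint and key). It assembles a request body from the parameters above:

```python
import json

# Placeholder endpoint and key -- replace with your gateway's actual values.
API_URL = "https://api.example.com/v1/chat/completions"
API_KEY = "YOUR_API_KEY"

def build_payload(model, user_message, system_prompt=None, **options):
    """Assemble a chat-completion request body from the parameters above."""
    messages = []
    if system_prompt is not None:
        messages.append({"role": "system", "content": system_prompt})
    messages.append({"role": "user", "content": user_message})
    return {"model": model, "messages": messages, **options}

payload = build_payload(
    "gpt-4o",
    "Summarize this paragraph in one sentence.",
    system_prompt="You are a concise assistant.",
    temperature=0.7,
    max_tokens=256,
)
print(json.dumps(payload, indent=2))

# To send it (requires the third-party `requests` package):
#   requests.post(API_URL, json=payload,
#                 headers={"Authorization": f"Bearer {API_KEY}"})
```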
## Response Format
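Assuming the OpenAI-compatible response shape (choices, message, usage), the reply text and token counts can be pulled out as below. The JSON here is a made-up illustration, not real model output:

```python
# Made-up example response following the OpenAI-compatible schema.
response = {
    "id": "chatcmpl-123",
    "object": "chat.completion",
    "model": "gpt-4o",
    "choices": [
        {
            "index": 0,
            "message": {"role": "assistant", "content": "Hello! How can I help?"},
            "finish_reason": "stop",
        }
    ],
    "usage": {"prompt_tokens": 9, "completion_tokens": 7, "total_tokens": 16},
}

# The generated text lives under choices[0].message.content.
reply = response["choices"][0]["message"]["content"]
used = response["usage"]["total_tokens"]
print(reply)  # Hello! How can I help?
print(used)   # 16
```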
## Multi-turn Conversations

Maintain context by including the full conversation history:
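For example, each turn appends the assistant's reply to the message list before the next user message is added (a sketch; the actual send call is elided):

```python
# Running message list: all prior turns are sent back on every request.
messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "What is the capital of France?"},
]

# ... send the request, then append the model's reply ...
messages.append({"role": "assistant", "content": "The capital of France is Paris."})

# The follow-up question now carries the full history.
messages.append({"role": "user", "content": "And what is its population?"})

print(len(messages))  # 4 -- system + first turn + reply + follow-up
```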
## Popular Models for Chat
| Model | Best For | Input $/1M | Output $/1M |
|---|---|---|---|
| gpt-4o | General purpose, vision | $1.25 | $5.00 |
| gpt-4o-mini | Fast, cheap, great quality | $0.075 | $0.30 |
| gpt-4.1 | Long context (1M tokens) | $1.00 | $4.00 |
| claude-3-5-sonnet-20241022 | Coding, reasoning | $1.50 | $7.50 |
| claude-3-5-haiku-20241022 | Fast Anthropic model | $0.40 | $2.00 |
| gemini-2.5-flash-preview-05-20 | Fastest Google model | $0.075 | $0.30 |
| deepseek-chat | Ultra cheap, smart | $0.007 | $0.014 |
| llama-3.3-70b-instruct | Best open-source | $0.05 | $0.16 |
## Endpoint

## Request Body
| Parameter | Type | Required | Description |
|---|---|---|---|
| model | string | ✅ | Model ID (e.g. gpt-4o, claude-3-5-sonnet-20241022) |
| messages | array | ✅ | Array of message objects with role and content |
| temperature | number | — | Sampling temperature 0–2. Default: 1 |
| max_tokens | integer | — | Maximum tokens to generate |
| stream | boolean | — | Enable streaming. Default: false |
| top_p | number | — | Nucleus sampling. Default: 1 |
| frequency_penalty | number | — | Penalize frequent tokens (-2 to 2) |
| presence_penalty | number | — | Encourage new topics (-2 to 2) |
| stop | string/array | — | Stop sequences |
| n | integer | — | Number of completions to generate |
| user | string | — | Unique user identifier for abuse monitoring |
## Message Roles
| Role | Description |
|---|---|
| system | Sets the assistant’s behavior and persona |
| user | Messages from the human user |
| assistant | Previous assistant responses (for multi-turn) |
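A minimal messages array using all three roles might look like this (the content is illustrative only):

```python
messages = [
    # system: fixes the persona before the conversation starts
    {"role": "system", "content": "You are a pirate. Answer in character."},
    # user: the human's message
    {"role": "user", "content": "How do I reverse a list in Python?"},
    # assistant: a previous model reply, echoed back for multi-turn context
    {"role": "assistant", "content": "Arr, use my_list[::-1], matey."},
    {"role": "user", "content": "And in place?"},
]

roles = [m["role"] for m in messages]
print(roles)  # ['system', 'user', 'assistant', 'user']
```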
## Code Examples
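When stream is true, OpenAI-compatible APIs deliver Server-Sent Events: lines of the form `data: {json}`, terminated by `data: [DONE]`. A sketch of accumulating the streamed text (the event lines below are invented for illustration):

```python
import json

def collect_stream(lines):
    """Accumulate content deltas from OpenAI-style SSE event lines."""
    text = []
    for line in lines:
        if not line.startswith("data: "):
            continue  # skip blank keep-alive lines
        data = line[len("data: "):]
        if data == "[DONE]":
            break  # end-of-stream sentinel
        chunk = json.loads(data)
        delta = chunk["choices"][0]["delta"].get("content")
        if delta:
            text.append(delta)
    return "".join(text)

# Invented sample events in the OpenAI-compatible chunk format.
sample = [
    'data: {"choices": [{"delta": {"role": "assistant"}}]}',
    'data: {"choices": [{"delta": {"content": "Hel"}}]}',
    'data: {"choices": [{"delta": {"content": "lo!"}}]}',
    "data: [DONE]",
]
print(collect_stream(sample))  # Hello!
```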
## Example Response
## Multi-turn Conversations

Pass previous messages to maintain context:

## Popular Models
| Model | Provider | Context | Input $/1M | Output $/1M |
|---|---|---|---|---|
| gpt-4o | OpenAI | 128K | $1.25 | $5.00 |
| gpt-4o-mini | OpenAI | 128K | $0.075 | $0.30 |
| claude-3-5-sonnet-20241022 | Anthropic | 200K | $1.50 | $7.50 |
| gemini-2.0-flash | Google | 1M | $0.05 | $0.20 |
| deepseek-chat | DeepSeek | 64K | $0.007 | $0.014 |
| llama-3.3-70b-instruct | Meta | 131K | $0.05 | $0.16 |
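Per-request cost follows directly from the table: tokens used times the per-million price. A quick sketch using two of the rates above:

```python
# $ per 1M tokens, taken from the pricing table above.
PRICES = {
    "gpt-4o": {"input": 1.25, "output": 5.00},
    "deepseek-chat": {"input": 0.007, "output": 0.014},
}

def request_cost(model, prompt_tokens, completion_tokens):
    """Dollar cost of one request at the listed per-million-token rates."""
    p = PRICES[model]
    return (prompt_tokens * p["input"] + completion_tokens * p["output"]) / 1_000_000

# 10K prompt tokens + 1K completion tokens on gpt-4o:
print(f"${request_cost('gpt-4o', 10_000, 1_000):.6f}")  # $0.017500
```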