Overview
Vision-capable models can analyze images alongside text. Pass image URLs or base64-encoded images in thecontent field of your messages.
Supported Models
| Model | Provider | Notes |
|---|---|---|
gpt-4o | OpenAI | Best overall vision |
gpt-4o-mini | OpenAI | Fast and affordable |
claude-3-5-sonnet-20241022 | Anthropic | Excellent document understanding |
gemini-2.0-flash | Fast, 1M context | |
gemini-2.5-pro | Best for complex visual reasoning | |
llava-1.6 | Open Source | Open-weight vision model |