Rate Limits by Plan

Plan        Open-Source RPD   Closed-Source RPD   Pro Models RPD
Free        150               70                  0
Starter     2,000             1,000               0
Pro         4,500             2,500               650
Enterprise  Unlimited         Unlimited           Unlimited

RPD = Requests Per Day

Model Classes

Class          Examples
Open-Source    Llama, Mistral, Mixtral, DeepSeek, Qwen
Closed-Source  GPT-4o, Claude, Gemini, Grok
Pro            Sora, Veo, DALL-E 3 HD, gpt-4.5

Rate Limit Headers

Every API response includes headers showing your current usage:
X-RateLimit-Limit-Requests: 2000
X-RateLimit-Remaining-Requests: 1847
X-RateLimit-Reset-Requests: 2024-01-01T00:00:00Z
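
The headers above can be read into a small structure before deciding whether to send more requests. This is a minimal sketch, not part of the API: the names `RateLimitStatus` and `parse_rate_limit_headers` are hypothetical, and it assumes the reset value is always an ISO 8601 UTC timestamp in the form shown in the example.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Mapping


@dataclass
class RateLimitStatus:
    limit: int          # daily request allowance
    remaining: int      # requests left before the next reset
    reset_at: datetime  # time at which the counter resets


def parse_rate_limit_headers(headers: Mapping[str, str]) -> RateLimitStatus:
    """Parse the three X-RateLimit-* headers (hypothetical helper)."""
    # HTTP header names are case-insensitive, so normalize before lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    # Assumption: the reset time is ISO 8601 with a trailing "Z" for UTC.
    reset_raw = lowered["x-ratelimit-reset-requests"].replace("Z", "+00:00")
    return RateLimitStatus(
        limit=int(lowered["x-ratelimit-limit-requests"]),
        remaining=int(lowered["x-ratelimit-remaining-requests"]),
        reset_at=datetime.fromisoformat(reset_raw),
    )
```

With the `openai` Python client, response headers should be reachable through `client.chat.completions.with_raw_response.create(...)`, whose result exposes `.headers` and `.parse()`; other HTTP clients expose headers in their own way.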

Handling Rate Limit Errors

When you hit a rate limit, the API returns HTTP 429:
{
  "error": {
    "type": "rate_limit_exceeded",
    "message": "Rate limit exceeded. You have used 2000/2000 requests today.",
    "code": 429
  }
}
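
For clients that call the API over plain HTTP rather than through an SDK, the error body above can be checked directly. The following is a sketch under the assumption that every rate limit response has exactly the JSON shape shown; `rate_limit_message` is a hypothetical helper name.

```python
import json
from typing import Optional


def rate_limit_message(status_code: int, body: str) -> Optional[str]:
    """Return the error message for a rate limit response, else None."""
    if status_code != 429:
        return None
    try:
        payload = json.loads(body)
    except json.JSONDecodeError:
        # The body is not JSON, so it does not match the documented shape.
        return None
    error = payload.get("error") if isinstance(payload, dict) else None
    if not isinstance(error, dict) or error.get("type") != "rate_limit_exceeded":
        return None
    return error.get("message", "Rate limit exceeded.")
```

Returning `None` for anything that does not match keeps the caller's handling of other error types separate from rate limit handling.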

Retry with Exponential Backoff

import time
import random
from openai import OpenAI, RateLimitError

client = OpenAI(
    api_key="sk-samurai-YOUR_KEY",
    base_url="https://api.samuraiapi.in/v1"
)

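def chat_with_retry(messages, max_retries=5):
    """Call the chat completions endpoint, retrying on HTTP 429 with exponential backoff."""
    for attempt in range(max_retries):
        try:
            return client.chat.completions.create(
                model="gpt-4o",
                messages=messages
            )
        except RateLimitError:
            # Out of attempts: re-raise so the caller can handle the error.
            if attempt == max_retries - 1:
                raise
            # Wait 1s, 2s, 4s, ... plus up to 1s of random jitter.
            wait = (2 ** attempt) + random.random()
            print(f"Rate limited. Retrying in {wait:.1f}s...")
            time.sleep(wait)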

Rate limits reset at midnight UTC. Contact support for enterprise rate limit increases.
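
Because the limits are daily, a client that has used its whole allowance will presumably keep receiving 429 responses until the reset, however long it backs off in between. A minimal sketch for computing the wait until the next midnight UTC follows; `seconds_until_reset` is a hypothetical helper, and it assumes the reset happens at exactly 00:00:00 UTC.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def seconds_until_reset(now: Optional[datetime] = None) -> float:
    """Return the number of seconds from `now` until the next midnight UTC."""
    # Default to the current time; a timezone-aware value is converted to UTC.
    now = (now or datetime.now(timezone.utc)).astimezone(timezone.utc)
    # Move to tomorrow's date, then truncate to the start of that day.
    next_midnight = (now + timedelta(days=1)).replace(
        hour=0, minute=0, second=0, microsecond=0
    )
    return (next_midnight - now).total_seconds()
```

A caller could compare this value with its backoff delay and, when the daily allowance is exhausted, pause or queue work until the reset instead of retrying every few seconds.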