## Rate Limits by Plan
| Plan | Open-Source RPD | Closed-Source RPD | Pro Models RPD |
|---|---|---|---|
| Free | 150 | 70 | 0 |
| Starter | 2,000 | 1,000 | 0 |
| Pro | 4,500 | 2,500 | 650 |
| Enterprise | Unlimited | Unlimited | Unlimited |
RPD = Requests Per Day
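The plan table above can be encoded as a simple lookup for client-side quota checks — a minimal sketch (the plan and class keys are illustrative names, not values the API itself uses; `None` stands in for "Unlimited"):

```python
# Daily request limits (RPD) by plan and model class, from the table above.
# None means unlimited.
RPD_LIMITS = {
    "free":       {"open_source": 150,  "closed_source": 70,   "pro": 0},
    "starter":    {"open_source": 2000, "closed_source": 1000, "pro": 0},
    "pro":        {"open_source": 4500, "closed_source": 2500, "pro": 650},
    "enterprise": {"open_source": None, "closed_source": None, "pro": None},
}

def requests_allowed(plan, model_class, used_today):
    """Return True if another request fits within today's RPD limit."""
    limit = RPD_LIMITS[plan][model_class]
    return limit is None or used_today < limit

print(requests_allowed("starter", "closed_source", 999))  # True (1000 RPD)
print(requests_allowed("free", "pro", 0))                 # False (0 RPD)
```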
## Model Classes
| Class | Examples |
|---|---|
| Open-Source | Llama, Mistral, Mixtral, DeepSeek, Qwen |
| Closed-Source | GPT-4o, Claude, Gemini, Grok |
| Pro | Sora, Veo, DALL-E 3 HD, gpt-4.5 |
Every API response includes headers showing your current usage:
```text
X-RateLimit-Limit-Requests: 2000
X-RateLimit-Remaining-Requests: 1847
X-RateLimit-Reset-Requests: 2024-01-01T00:00:00Z
```
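These headers can be parsed client-side to track remaining quota before you hit a 429 — a minimal sketch using only the standard library, with the sample header values above as stand-in data:

```python
from datetime import datetime

def parse_rate_limit_headers(headers):
    """Extract the daily limit, remaining requests, and reset time."""
    return {
        "limit": int(headers["X-RateLimit-Limit-Requests"]),
        "remaining": int(headers["X-RateLimit-Remaining-Requests"]),
        # fromisoformat() before Python 3.11 rejects a trailing "Z",
        # so normalize it to an explicit UTC offset first.
        "reset_at": datetime.fromisoformat(
            headers["X-RateLimit-Reset-Requests"].replace("Z", "+00:00")
        ),
    }

headers = {
    "X-RateLimit-Limit-Requests": "2000",
    "X-RateLimit-Remaining-Requests": "1847",
    "X-RateLimit-Reset-Requests": "2024-01-01T00:00:00Z",
}
info = parse_rate_limit_headers(headers)
print(info["remaining"])  # 1847
```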
## Handling Rate Limit Errors
When you exceed a limit, the API returns HTTP 429 with a JSON error body:
```json
{
  "error": {
    "type": "rate_limit_exceeded",
    "message": "Rate limit exceeded. You have used 2000/2000 requests today.",
    "code": 429
  }
}
```
### Retry with Exponential Backoff
```python
import time
import random

from openai import OpenAI, RateLimitError

client = OpenAI(
    api_key="sk-samurai-YOUR_KEY",
    base_url="https://api.samuraiapi.in/v1"
)

def chat_with_retry(messages, max_retries=5):
    """Call the chat endpoint, retrying on HTTP 429 with exponential backoff."""
    for attempt in range(max_retries):
        try:
            return client.chat.completions.create(
                model="gpt-4o",
                messages=messages
            )
        except RateLimitError:
            if attempt == max_retries - 1:
                raise  # give up after max_retries attempts
            # Exponential backoff with jitter: ~1s, ~2s, ~4s, ~8s, ...
            wait = (2 ** attempt) + random.random()
            print(f"Rate limited. Retrying in {wait:.1f}s...")
            time.sleep(wait)
```
Rate limits reset at midnight UTC. Contact support for enterprise rate limit increases.
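Since limits reset at midnight UTC, you can compute how long to wait before your daily quota refreshes — a minimal sketch using only the standard library (the helper name is illustrative, not part of the API):

```python
from datetime import datetime, timedelta, timezone

def seconds_until_reset(now=None):
    """Seconds remaining until the next midnight UTC, when RPD limits reset."""
    now = now or datetime.now(timezone.utc)
    next_midnight = (now + timedelta(days=1)).replace(
        hour=0, minute=0, second=0, microsecond=0
    )
    return (next_midnight - now).total_seconds()

# At 23:00 UTC, the quota refreshes in one hour.
print(seconds_until_reset(datetime(2024, 1, 1, 23, 0, tzinfo=timezone.utc)))  # 3600.0
```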