Overview
Rate limiting is currently in development and will be implemented in a future release. This page documents the planned rate limiting behavior.
Rate limits control the number of API requests you can make within a specific time window. They help ensure fair usage and maintain API performance for all users.
Planned Rate Limits
The following rate limits are planned for implementation:

| Requests per minute | Requests per hour | Burst allowance |
| --- | --- | --- |
| 60 | 1,000 | 100 requests |
| 300 | 10,000 | 500 requests |
| Custom | Custom | Custom |
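The burst allowance can be thought of as a token bucket: up to the burst number of requests may be sent at once, and capacity refills continuously at the per-minute rate. The server's actual algorithm has not been announced, so the following is only an illustrative client-side sketch (`TokenBucket` is a hypothetical name, not part of the API):

```javascript
// Illustrative client-side token bucket: `capacity` mirrors the burst
// allowance, and tokens refill continuously at the per-minute rate.
// This only demonstrates the idea; the server's real algorithm may differ.
class TokenBucket {
  constructor(ratePerMinute, burst, now = () => Date.now()) {
    this.capacity = burst;
    this.tokens = burst; // start full: a fresh client can burst
    this.refillPerMs = ratePerMinute / 60000;
    this.now = now;
    this.lastRefill = now();
  }

  // Returns true if a request may be sent right now.
  tryRemove() {
    const t = this.now();
    this.tokens = Math.min(
      this.capacity,
      this.tokens + (t - this.lastRefill) * this.refillPerMs
    );
    this.lastRefill = t;
    if (this.tokens >= 1) {
      this.tokens -= 1;
      return true;
    }
    return false;
  }
}
```

Under the first planned tier (60 requests per minute, burst of 100), `new TokenBucket(60, 100)` would let a client send 100 requests immediately, then roughly one more per second as tokens refill.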
When rate limiting is implemented, all API responses will include the following headers:

- `X-RateLimit-Limit`: The maximum number of requests allowed in the current time window
- `X-RateLimit-Remaining`: The number of requests remaining in the current time window
- `X-RateLimit-Reset`: Unix timestamp indicating when the rate limit window resets
- `Retry-After`: Number of seconds to wait before retrying (only included when rate limited)
```
HTTP/1.1 200 OK
X-RateLimit-Limit: 60
X-RateLimit-Remaining: 45
X-RateLimit-Reset: 1709856000
```
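Since header values arrive as strings, it can be convenient to parse them into numbers once. A small helper along these lines (a sketch; `parseRateLimitHeaders` is an illustrative name, not part of the API):

```javascript
// Parse the documented rate limit headers into numbers.
// Headers that are absent (e.g. Retry-After on a 200 response) become null.
function parseRateLimitHeaders(headers) {
  const num = (name) => {
    const value = headers.get(name);
    return value == null ? null : parseInt(value, 10);
  };
  return {
    limit: num('X-RateLimit-Limit'),
    remaining: num('X-RateLimit-Remaining'),
    reset: num('X-RateLimit-Reset'),
    retryAfter: num('Retry-After') // null unless you were rate limited
  };
}
```

The helper accepts anything with a `get(name)` method, so it works with the `Headers` object returned by `fetch`.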
Rate Limit Exceeded Response
When you exceed the rate limit, the API will return a 429 Too Many Requests error:
```json
{
  "error": {
    "code": "RATE_LIMIT_EXCEEDED",
    "message": "rate limit exceeded, please retry after 30 seconds",
    "request_id": "req_xyz789"
  }
}
```
HTTP Status: 429 Too Many Requests
Response Headers:
```
HTTP/1.1 429 Too Many Requests
X-RateLimit-Limit: 60
X-RateLimit-Remaining: 0
X-RateLimit-Reset: 1709856000
Retry-After: 30
```
Best Practices
1. Monitor Rate Limit Headers

Always check the rate limit headers in your responses to track your usage:
```javascript
const response = await fetch('https://api.prompts.dev/v1/prompts', {
  headers: { 'Authorization': `Bearer ${token}` }
});

const remaining = response.headers.get('X-RateLimit-Remaining');
const reset = response.headers.get('X-RateLimit-Reset');

console.log(`Requests remaining: ${remaining}`);
console.log(`Rate limit resets at: ${new Date(reset * 1000)}`);
```
2. Implement Exponential Backoff
When you receive a 429 error, implement exponential backoff with the Retry-After header:
```javascript
async function fetchWithRetry(url, options, maxRetries = 3) {
  for (let i = 0; i < maxRetries; i++) {
    const response = await fetch(url, options);

    if (response.status === 429) {
      // Honor the server's Retry-After hint; fall back to exponential backoff
      const retryAfter = response.headers.get('Retry-After');
      const delay = retryAfter ? parseInt(retryAfter, 10) * 1000 : Math.pow(2, i) * 1000;
      console.log(`Rate limited. Retrying after ${delay}ms...`);
      await new Promise(resolve => setTimeout(resolve, delay));
      continue;
    }

    return response;
  }

  throw new Error('Max retries exceeded');
}
```
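One common refinement, not required by the API but standard practice, is adding random jitter to the backoff so that many clients rate limited at the same moment don't all retry in lockstep. A "full jitter" variant of the delay calculation (the `base` and `cap` defaults are illustrative, not API-mandated):

```javascript
// Full jitter: pick a uniformly random delay between 0 and the
// exponential cap for this attempt.
function backoffDelay(attempt, base = 1000, cap = 30000) {
  const exp = Math.min(cap, base * 2 ** attempt);
  return Math.floor(Math.random() * exp);
}
```

Substituting `backoffDelay(i)` for `Math.pow(2, i) * 1000` in the fallback branch above spreads retries out instead of synchronizing them.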
3. Cache Responses
Reduce API calls by caching responses locally:
```javascript
const cache = new Map();

async function getCachedPrompt(promptId, ttl = 60000) {
  const cached = cache.get(promptId);

  // Serve from cache while the entry is younger than the TTL
  if (cached && Date.now() - cached.timestamp < ttl) {
    return cached.data;
  }

  const response = await fetch(`https://api.prompts.dev/v1/prompts/${promptId}`);
  const data = await response.json();

  cache.set(promptId, {
    data,
    timestamp: Date.now()
  });

  return data;
}
```
4. Batch Requests
When possible, use batch endpoints to reduce the number of individual requests:
```javascript
// Instead of multiple individual requests
for (const id of promptIds) {
  await fetch(`/v1/prompts/${id}`);
}

// Use batch endpoint (when available)
await fetch('/v1/prompts/batch', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({ ids: promptIds })
});
```
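Batch endpoints typically cap how many ids they accept per call (a plausible but unconfirmed constraint here). If so, splitting the id list into fixed-size chunks keeps each request valid; the `chunk` helper below is an illustrative utility, not part of the API:

```javascript
// Split an array into chunks of at most `size` items,
// so each batch request stays under a per-call id limit.
function chunk(items, size) {
  const out = [];
  for (let i = 0; i < items.length; i += size) {
    out.push(items.slice(i, i + size));
  }
  return out;
}
```

For example, `chunk(promptIds, 100)` yields arrays of at most 100 ids, each of which can be sent as one batch request.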
5. Implement Request Queuing
Queue requests to stay within rate limits:
```javascript
class RateLimitedQueue {
  constructor(maxPerMinute) {
    this.queue = [];
    this.maxPerMinute = maxPerMinute;
    this.requestTimes = [];
  }

  async add(requestFn) {
    return new Promise((resolve, reject) => {
      this.queue.push({ requestFn, resolve, reject });
      this.process();
    });
  }

  async process() {
    if (this.queue.length === 0) return;

    const now = Date.now();
    // Keep only the timestamps from the last minute
    this.requestTimes = this.requestTimes.filter(time => now - time < 60000);

    if (this.requestTimes.length >= this.maxPerMinute) {
      // At the limit: wait until the oldest request ages out of the window
      const oldestRequest = this.requestTimes[0];
      const delay = 60000 - (now - oldestRequest);
      setTimeout(() => this.process(), delay);
      return;
    }

    const { requestFn, resolve, reject } = this.queue.shift();
    this.requestTimes.push(now);

    try {
      const result = await requestFn();
      resolve(result);
    } catch (error) {
      reject(error);
    }

    if (this.queue.length > 0) {
      setTimeout(() => this.process(), 0);
    }
  }
}

// Usage
const queue = new RateLimitedQueue(60);
await queue.add(() => fetch('/v1/prompts'));
```
Checking Your Current Usage
Once rate limiting is implemented, you’ll be able to check your current rate limit status without making additional API calls by inspecting the headers from any recent request.
Example: Checking Rate Limit Status
```javascript
async function checkRateLimitStatus() {
  // Make a lightweight request (like fetching user info)
  const response = await fetch('https://api.prompts.dev/v1/user', {
    headers: { 'Authorization': `Bearer ${token}` }
  });

  const limit = parseInt(response.headers.get('X-RateLimit-Limit'), 10);
  const remaining = parseInt(response.headers.get('X-RateLimit-Remaining'), 10);
  const reset = parseInt(response.headers.get('X-RateLimit-Reset'), 10);

  const resetDate = new Date(reset * 1000);
  const percentUsed = ((limit - remaining) / limit * 100).toFixed(1);

  console.log(`Rate Limit Status:`);
  console.log(`  Total: ${limit} requests`);
  console.log(`  Remaining: ${remaining} requests`);
  console.log(`  Used: ${percentUsed}%`);
  console.log(`  Resets: ${resetDate.toLocaleString()}`);

  return { limit, remaining, reset, percentUsed };
}
```
Need Higher Limits?
If your application requires higher rate limits, please contact our sales team to discuss Enterprise pricing options with custom rate limits tailored to your needs.