Overview

The Pump.fun API implements rate limiting to ensure fair usage and maintain service stability. Rate limits restrict the number of requests you can make within a specific time window. Understanding and respecting these limits is crucial for building reliable applications.
Exceeding rate limits results in 429 Too Many Requests errors. Implement proper rate limiting in your application to avoid service disruptions.

Rate Limit Headers

The API includes rate limiting information in response headers. Check these headers to monitor your usage:
  • x-ratelimit-limit (integer): The maximum number of requests allowed in the current time window.
  • x-ratelimit-remaining (integer): The number of requests remaining in the current time window.
  • x-ratelimit-reset (Unix timestamp): The time when the current rate limit window resets.

Example Response Headers

HTTP/1.1 200 OK
Content-Type: application/json
x-ratelimit-limit: 100
x-ratelimit-remaining: 75
x-ratelimit-reset: 1709876543
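
Because x-ratelimit-reset is a Unix timestamp, you can compute how long to wait for a fresh window directly. A small sketch; the header values here mirror the example above, and the helper name is illustrative:

```python
import time

# Hypothetical header values, matching the example response above
headers = {
    "x-ratelimit-limit": "100",
    "x-ratelimit-remaining": "75",
    "x-ratelimit-reset": "1709876543",
}

def seconds_until_reset(headers):
    """Return seconds until the rate limit window resets (0 if already past)."""
    reset = int(headers["x-ratelimit-reset"])  # Unix timestamp, in seconds
    return max(0, reset - int(time.time()))

print(f"Window resets in {seconds_until_reset(headers)} seconds")
```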

Rate Limit Response

When you exceed the rate limit, the API returns a 429 Too Many Requests status code:
HTTP/1.1 429 Too Many Requests
Content-Type: application/json
Retry-After: 60
x-ratelimit-limit: 100
x-ratelimit-remaining: 0
x-ratelimit-reset: 1709876543

{
  "error": "Too Many Requests",
  "message": "Rate limit exceeded. Please try again later.",
  "statusCode": 429
}
The Retry-After header indicates how many seconds to wait before making another request.
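
Note that per RFC 9110, Retry-After may be either an integer number of seconds or an HTTP-date. A hedged helper that handles both forms (the function name and default value are illustrative, not part of the API):

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime

def parse_retry_after(value, default=60):
    """Return a wait time in seconds from a Retry-After header value.

    RFC 9110 allows either an integer number of seconds or an HTTP-date;
    fall back to `default` when the header is missing or unparseable.
    """
    if value is None:
        return default
    try:
        return max(0, int(value))  # "60" -> 60 seconds
    except ValueError:
        pass
    try:
        # "Wed, 21 Oct 2015 07:28:00 GMT" -> seconds from now
        when = parsedate_to_datetime(value)
        return max(0, (when - datetime.now(timezone.utc)).total_seconds())
    except (TypeError, ValueError):
        return default
```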

Checking Rate Limits

Monitor rate limit headers in your application to avoid hitting limits:
import requests
import time

class RateLimitedClient:
    def __init__(self, token):
        self.base_url = "https://frontend-api-v3.pump.fun"
        self.token = token
        self.headers = {
            "Authorization": f"Bearer {token}",
            "Accept": "application/json"
        }
    
    def make_request(self, endpoint, max_retries=3):
        url = f"{self.base_url}{endpoint}"
        response = requests.get(url, headers=self.headers)
        
        # Check rate limit headers
        limit = response.headers.get('x-ratelimit-limit')
        remaining = response.headers.get('x-ratelimit-remaining')
        reset_time = response.headers.get('x-ratelimit-reset')
        
        if limit and remaining:
            print(f"Rate limit: {remaining}/{limit} requests remaining")
            
            # Warn if approaching limit
            if int(remaining) < int(limit) * 0.1:  # Less than 10% remaining
                print("Warning: Approaching rate limit!")
        
        # Handle rate limit exceeded (bounded retries, to avoid infinite recursion)
        if response.status_code == 429 and max_retries > 0:
            retry_after = int(response.headers.get('Retry-After', 60))
            print(f"Rate limited. Waiting {retry_after} seconds...")
            time.sleep(retry_after)
            return self.make_request(endpoint, max_retries - 1)  # Retry
        
        response.raise_for_status()
        return response.json()

# Usage
client = RateLimitedClient('<your_jwt_token>')
data = client.make_request('/coins/{mint}')

Rate Limiting Strategies

Token Bucket Algorithm

Implement a token bucket to control request rates:
import time
import threading

class TokenBucket:
    def __init__(self, rate, capacity):
        self.rate = rate  # Tokens per second
        self.capacity = capacity  # Maximum tokens
        self.tokens = capacity
        self.last_update = time.time()
        self.lock = threading.Lock()
    
    def consume(self, tokens=1):
        with self.lock:
            now = time.time()
            elapsed = now - self.last_update
            
            # Add tokens based on elapsed time
            self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
            self.last_update = now
            
            if self.tokens >= tokens:
                self.tokens -= tokens
                return True
            return False
    
    def wait_for_token(self):
        while not self.consume():
            time.sleep(0.1)

# Usage: Limit to 10 requests per second
bucket = TokenBucket(rate=10, capacity=10)

for i in range(100):
    bucket.wait_for_token()
    # Make API request
    print(f"Request {i+1}")

Request Queue

Queue requests and process them at a controlled rate:
import time
import queue
import threading

class RateLimitedQueue:
    def __init__(self, requests_per_second):
        self.delay = 1.0 / requests_per_second
        self.queue = queue.Queue()
        self.running = True
        self.worker = threading.Thread(target=self._process_queue)
        self.worker.start()
    
    def _process_queue(self):
        while self.running:
            try:
                func, args, kwargs = self.queue.get(timeout=1)
            except queue.Empty:
                continue
            try:
                func(*args, **kwargs)
            finally:
                self.queue.task_done()  # Mark done even if func raises, so wait() can't hang
            time.sleep(self.delay)
    
    def enqueue(self, func, *args, **kwargs):
        self.queue.put((func, args, kwargs))
    
    def wait(self):
        self.queue.join()
    
    def stop(self):
        self.running = False
        self.worker.join()

# Usage
def make_api_request(endpoint):
    print(f"Requesting {endpoint}")
    # Actual API call here

rate_limiter = RateLimitedQueue(requests_per_second=10)

for i in range(100):
    rate_limiter.enqueue(make_api_request, f"/endpoint/{i}")

rate_limiter.wait()
rate_limiter.stop()

Best Practices

  • Monitor headers: Always check the x-ratelimit-* headers in responses to track your usage and adjust request rates accordingly.
  • Back off on 429s: When you receive a 429 error, wait before retrying. Use exponential backoff (1s, 2s, 4s, 8s) to avoid hammering the API.
  • Respect Retry-After: When rate limited, always honor the Retry-After header value. Don’t retry before this time has elapsed.
  • Batch where possible: Some endpoints support batch operations. Use them to reduce the number of individual requests.
  • Cache responses: Implement caching with ETags to reduce unnecessary requests. See the Caching guide for details.
  • Smooth your traffic: Avoid bursts of requests. Spread them evenly throughout the rate limit window.
  • Prefer push over polling: Instead of polling endpoints repeatedly, use webhooks or server-sent events when available to receive updates.
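
The ETag practice above can be sketched as a small conditional-request cache. This assumes the endpoint returns an ETag header and honors If-None-Match (see the Caching guide); `fetch` stands in for whatever HTTP client you use:

```python
# Minimal sketch of ETag-based conditional requests. `fetch(url, headers)`
# is a stand-in for your HTTP client (e.g. a requests.get wrapper) and
# must return (status_code, response_headers, body).

_etag_cache = {}  # url -> (etag, cached_body)

def get_with_etag(url, fetch):
    """Fetch url, sending If-None-Match when we hold a cached ETag.

    A 304 Not Modified response lets us reuse the cached body instead of
    re-downloading, which spends less of the rate limit budget.
    """
    headers = {}
    cached = _etag_cache.get(url)
    if cached:
        headers["If-None-Match"] = cached[0]  # Only send body if it changed
    status, resp_headers, body = fetch(url, headers)
    if status == 304 and cached:
        return cached[1]  # Not modified: reuse cached body
    etag = resp_headers.get("ETag")
    if etag:
        _etag_cache[url] = (etag, body)
    return body
```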

Rate Limit Tiers

Rate limits may vary by endpoint and authentication level. Higher tier accounts or specific endpoints may have different limits.

General Guidelines

  • Unauthenticated requests: Lower rate limits apply
  • Authenticated requests: Standard rate limits based on account tier
  • Admin endpoints: May have separate, stricter rate limits
  • Read operations: Generally higher limits than write operations
  • Batch endpoints: May have different limits than single-item endpoints

Handling Rate Limit Errors

When you encounter a 429 error:
  1. Stop sending requests immediately - Don’t continue making requests
  2. Check the Retry-After header - Wait the specified number of seconds
  3. Implement exponential backoff - If no Retry-After header, use exponential backoff
  4. Log the incident - Track rate limit errors to identify patterns
  5. Adjust your application - Reduce request rate to stay within limits
Repeated rate limit violations may result in temporary or permanent API access restrictions. Always respect rate limits and implement proper throttling.
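
The steps above can be combined into a small retry wrapper. A sketch using exponential backoff with jitter; RateLimitError is a hypothetical exception your client would raise on a 429, carrying any Retry-After value:

```python
import random
import time

class RateLimitError(Exception):
    """Hypothetical error your HTTP layer raises on a 429 response."""
    def __init__(self, retry_after=None):
        self.retry_after = retry_after

def with_backoff(call, max_retries=5, base_delay=1.0):
    """Retry `call` on rate limit errors using exponential backoff.

    Honors the server-supplied Retry-After when present; otherwise waits
    1s, 2s, 4s, 8s... with a little jitter to avoid synchronized retries.
    """
    for attempt in range(max_retries):
        try:
            return call()
        except RateLimitError as err:
            if attempt == max_retries - 1:
                raise  # Out of retries: surface the error
            delay = err.retry_after or base_delay * (2 ** attempt)
            time.sleep(delay * (1 + random.uniform(0, 0.5)))  # Add jitter
```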

Testing Rate Limits

When testing your rate limiting implementation:
import requests
import time

def test_rate_limiting():
    url = "https://frontend-api-v3.pump.fun/coins/{mint}"
    headers = {"Authorization": "Bearer <token>", "Accept": "application/json"}
    
    requests_made = 0
    start_time = time.time()
    
    max_requests = 500  # Safety cap so the test terminates even if never limited
    while requests_made < max_requests:
        response = requests.get(url, headers=headers)
        requests_made += 1
        
        if response.status_code == 429:
            elapsed = time.time() - start_time
            print(f"Rate limited after {requests_made} requests in {elapsed:.2f}s")
            print(f"Rate: {requests_made/elapsed:.2f} requests/second")
            break
        
        print(f"Request {requests_made}: {response.status_code}")
        time.sleep(0.1)  # Small delay between requests
    else:
        print(f"No 429 received within {max_requests} requests")

test_rate_limiting()