Overview

The Pump.fun API implements rate limiting to ensure fair usage and maintain service stability. Rate limits restrict the number of requests you can make within a specific time window. Understanding and respecting these limits is crucial for building reliable applications.
Exceeding rate limits results in 429 Too Many Requests errors. Implement proper rate limiting in your application to avoid service disruptions.

Rate Limit Headers

The API includes rate limiting information in response headers. Check these headers to monitor your usage:
  • x-ratelimit-limit (integer): The maximum number of requests allowed in the current time window.
  • x-ratelimit-remaining (integer): The number of requests remaining in the current time window.
  • x-ratelimit-reset (Unix timestamp): The time when the current rate limit window resets.

Example Response Headers

HTTP/1.1 200 OK
Content-Type: application/json
x-ratelimit-limit: 100
x-ratelimit-remaining: 75
x-ratelimit-reset: 1709876543
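
Because x-ratelimit-reset is a Unix timestamp, you can compute how long to wait for a fresh window directly. A small sketch; the header values here mirror the example above, and the helper name is illustrative:

```python
import time

# Hypothetical header values, matching the example response above
headers = {
    "x-ratelimit-limit": "100",
    "x-ratelimit-remaining": "75",
    "x-ratelimit-reset": "1709876543",
}

def seconds_until_reset(headers):
    """Return seconds until the rate limit window resets (0 if already past)."""
    reset = int(headers["x-ratelimit-reset"])  # Unix timestamp, in seconds
    return max(0, reset - int(time.time()))

print(f"Window resets in {seconds_until_reset(headers)} seconds")
```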

Rate Limit Response

When you exceed the rate limit, the API returns a 429 Too Many Requests status code:
HTTP/1.1 429 Too Many Requests
Content-Type: application/json
Retry-After: 60
x-ratelimit-limit: 100
x-ratelimit-remaining: 0
x-ratelimit-reset: 1709876543

{
  "error": "Too Many Requests",
  "message": "Rate limit exceeded. Please try again later.",
  "statusCode": 429
}
The Retry-After header indicates how many seconds to wait before making another request.
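
Note that per RFC 9110, Retry-After may be either an integer number of seconds or an HTTP-date. A hedged helper that handles both forms (the function name and default value are illustrative, not part of the API):

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime

def parse_retry_after(value, default=60):
    """Return a wait time in seconds from a Retry-After header value.

    RFC 9110 allows either an integer number of seconds or an HTTP-date;
    fall back to `default` when the header is missing or unparseable.
    """
    if value is None:
        return default
    try:
        return max(0, int(value))  # "60" -> 60 seconds
    except ValueError:
        pass
    try:
        # "Wed, 21 Oct 2015 07:28:00 GMT" -> seconds from now
        when = parsedate_to_datetime(value)
        return max(0, (when - datetime.now(timezone.utc)).total_seconds())
    except (TypeError, ValueError):
        return default
```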

Checking Rate Limits

Monitor rate limit headers in your application to avoid hitting limits:
import requests
import time

class RateLimitedClient:
    def __init__(self, token):
        self.base_url = "https://frontend-api-v3.pump.fun"
        self.token = token
        self.headers = {
            "Authorization": f"Bearer {token}",
            "Accept": "application/json"
        }
    
    def make_request(self, endpoint, max_retries=3):
        url = f"{self.base_url}{endpoint}"
        response = requests.get(url, headers=self.headers)
        
        # Check rate limit headers
        limit = response.headers.get('x-ratelimit-limit')
        remaining = response.headers.get('x-ratelimit-remaining')
        reset_time = response.headers.get('x-ratelimit-reset')
        
        if limit and remaining:
            print(f"Rate limit: {remaining}/{limit} requests remaining")
            
            # Warn if approaching limit
            if int(remaining) < int(limit) * 0.1:  # Less than 10% remaining
                print("Warning: Approaching rate limit!")
        
        # Handle rate limit exceeded (bounded retries, to avoid infinite recursion)
        if response.status_code == 429 and max_retries > 0:
            retry_after = int(response.headers.get('Retry-After', 60))
            print(f"Rate limited. Waiting {retry_after} seconds...")
            time.sleep(retry_after)
            return self.make_request(endpoint, max_retries - 1)  # Retry
        
        response.raise_for_status()
        return response.json()

# Usage
client = RateLimitedClient('<your_jwt_token>')
data = client.make_request('/coins/{mint}')

Rate Limiting Strategies

Token Bucket Algorithm

Implement a token bucket to control request rates:
import time
import threading

class TokenBucket:
    def __init__(self, rate, capacity):
        self.rate = rate  # Tokens per second
        self.capacity = capacity  # Maximum tokens
        self.tokens = capacity
        self.last_update = time.time()
        self.lock = threading.Lock()
    
    def consume(self, tokens=1):
        with self.lock:
            now = time.time()
            elapsed = now - self.last_update
            
            # Add tokens based on elapsed time
            self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
            self.last_update = now
            
            if self.tokens >= tokens:
                self.tokens -= tokens
                return True
            return False
    
    def wait_for_token(self):
        while not self.consume():
            time.sleep(0.1)

# Usage: Limit to 10 requests per second
bucket = TokenBucket(rate=10, capacity=10)

for i in range(100):
    bucket.wait_for_token()
    # Make API request
    print(f"Request {i+1}")

Request Queue

Queue requests and process them at a controlled rate:
import time
import queue
import threading

class RateLimitedQueue:
    def __init__(self, requests_per_second):
        self.delay = 1.0 / requests_per_second
        self.queue = queue.Queue()
        self.running = True
        self.worker = threading.Thread(target=self._process_queue)
        self.worker.start()
    
    def _process_queue(self):
        while self.running:
            try:
                func, args, kwargs = self.queue.get(timeout=1)
            except queue.Empty:
                continue
            try:
                func(*args, **kwargs)
            finally:
                self.queue.task_done()  # Mark done even if func raises, so wait() can't hang
            time.sleep(self.delay)
    
    def enqueue(self, func, *args, **kwargs):
        self.queue.put((func, args, kwargs))
    
    def wait(self):
        self.queue.join()
    
    def stop(self):
        self.running = False
        self.worker.join()

# Usage
def make_api_request(endpoint):
    print(f"Requesting {endpoint}")
    # Actual API call here

rate_limiter = RateLimitedQueue(requests_per_second=10)

for i in range(100):
    rate_limiter.enqueue(make_api_request, f"/endpoint/{i}")

rate_limiter.wait()
rate_limiter.stop()

Best Practices

  • Monitor headers: Always check the x-ratelimit-* headers in responses to track your usage and adjust request rates accordingly.
  • Back off on 429s: When you receive a 429 error, wait before retrying. Use exponential backoff (1s, 2s, 4s, 8s) to avoid hammering the API.
  • Respect Retry-After: When rate limited, always honor the Retry-After header value. Don’t retry before this time has elapsed.
  • Batch where possible: Some endpoints support batch operations. Use them to reduce the number of individual requests.
  • Cache responses: Implement caching with ETags to reduce unnecessary requests. See the Caching guide for details.
  • Smooth your traffic: Avoid bursts of requests. Spread them evenly throughout the rate limit window.
  • Prefer push over polling: Instead of polling endpoints repeatedly, use webhooks or server-sent events when available to receive updates.
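
The ETag practice above can be sketched as a small conditional-request cache. This assumes the endpoint returns an ETag header and honors If-None-Match (see the Caching guide); `fetch` stands in for whatever HTTP client you use:

```python
# Minimal sketch of ETag-based conditional requests. `fetch(url, headers)`
# is a stand-in for your HTTP client (e.g. a requests.get wrapper) and
# must return (status_code, response_headers, body).

_etag_cache = {}  # url -> (etag, cached_body)

def get_with_etag(url, fetch):
    """Fetch url, sending If-None-Match when we hold a cached ETag.

    A 304 Not Modified response lets us reuse the cached body instead of
    re-downloading, which spends less of the rate limit budget.
    """
    headers = {}
    cached = _etag_cache.get(url)
    if cached:
        headers["If-None-Match"] = cached[0]  # Only send body if it changed
    status, resp_headers, body = fetch(url, headers)
    if status == 304 and cached:
        return cached[1]  # Not modified: reuse cached body
    etag = resp_headers.get("ETag")
    if etag:
        _etag_cache[url] = (etag, body)
    return body
```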

Rate Limit Tiers

Rate limits may vary by endpoint and authentication level. Higher tier accounts or specific endpoints may have different limits.

General Guidelines

  • Unauthenticated requests: Lower rate limits apply
  • Authenticated requests: Standard rate limits based on account tier
  • Admin endpoints: May have separate, stricter rate limits
  • Read operations: Generally higher limits than write operations
  • Batch endpoints: May have different limits than single-item endpoints

Handling Rate Limit Errors

When you encounter a 429 error:
  1. Stop sending requests immediately - Don’t continue making requests
  2. Check the Retry-After header - Wait the specified number of seconds
  3. Implement exponential backoff - If no Retry-After header, use exponential backoff
  4. Log the incident - Track rate limit errors to identify patterns
  5. Adjust your application - Reduce request rate to stay within limits
Repeated rate limit violations may result in temporary or permanent API access restrictions. Always respect rate limits and implement proper throttling.
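
The steps above can be combined into a small retry wrapper. A sketch using exponential backoff with jitter; RateLimitError is a hypothetical exception your client would raise on a 429, carrying any Retry-After value:

```python
import random
import time

class RateLimitError(Exception):
    """Hypothetical error your HTTP layer raises on a 429 response."""
    def __init__(self, retry_after=None):
        self.retry_after = retry_after

def with_backoff(call, max_retries=5, base_delay=1.0):
    """Retry `call` on rate limit errors using exponential backoff.

    Honors the server-supplied Retry-After when present; otherwise waits
    1s, 2s, 4s, 8s... with a little jitter to avoid synchronized retries.
    """
    for attempt in range(max_retries):
        try:
            return call()
        except RateLimitError as err:
            if attempt == max_retries - 1:
                raise  # Out of retries: surface the error
            delay = err.retry_after or base_delay * (2 ** attempt)
            time.sleep(delay * (1 + random.uniform(0, 0.5)))  # Add jitter
```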

Testing Rate Limits

When testing your rate limiting implementation:
import requests
import time

def test_rate_limiting():
    url = "https://frontend-api-v3.pump.fun/coins/{mint}"
    headers = {"Authorization": "Bearer <token>", "Accept": "application/json"}
    
    requests_made = 0
    start_time = time.time()
    
    max_requests = 500  # Safety cap so the test terminates even if never limited
    while requests_made < max_requests:
        response = requests.get(url, headers=headers)
        requests_made += 1
        
        if response.status_code == 429:
            elapsed = time.time() - start_time
            print(f"Rate limited after {requests_made} requests in {elapsed:.2f}s")
            print(f"Rate: {requests_made/elapsed:.2f} requests/second")
            break
        
        print(f"Request {requests_made}: {response.status_code}")
        time.sleep(0.1)  # Small delay between requests
    else:
        print(f"No 429 received within {max_requests} requests")

test_rate_limiting()