BullMQ provides queue rate limiting capabilities to control how many jobs are processed within a given time period. This is useful for respecting external API limits, protecting downstream services, or managing resource consumption.

Worker-Level Rate Limiting

Configure rate limiting directly on your worker instances:
import { Worker } from 'bullmq';

const worker = new Worker('painter', async job => {
  return await paintCar(job);
}, {
  connection: {
    host: 'localhost',
    port: 6379,
  },
  limiter: {
    max: 10,
    duration: 1000,
  },
});
limiter.max (number, required)
Maximum number of jobs to process within the duration period.

limiter.duration (number, required)
Time window in milliseconds for the rate limit.
The rate limiter is global across all workers for the same queue. If you have 10 workers with the settings above, only 10 jobs total will be processed per second across all workers.
Jobs that get rate limited will stay in the waiting state until the rate limit window resets.

Queue-Level Rate Limiting

You can also set rate limits at the queue level using the setGlobalRateLimit method:
import { Queue } from 'bullmq';

const queue = new Queue('painter', {
  connection: {
    host: 'localhost',
    port: 6379,
  },
});

// Limit to 100 jobs per minute
await queue.setGlobalRateLimit(100, 60000);
See Global Rate Limit for more details on queue-level rate limiting.

Manual Rate Limiting

Sometimes you need dynamic rate limiting based on runtime conditions, such as receiving a 429 Too Many Requests response from an API:
import { Worker, RateLimitError } from 'bullmq';

const worker = new Worker(
  'myQueue',
  async (job) => {
    try {
      const response = await fetch(job.data.url);

      if (response.status === 429) {
        // Get the Retry-After header (in seconds)
        const retryAfter = response.headers.get('retry-after');
        const delayMs = retryAfter ? parseInt(retryAfter, 10) * 1000 : 5000;

        // Apply manual rate limit
        await worker.rateLimit(delayMs);

        // Throw the special error to move the job back to waiting
        throw Worker.RateLimitError();
      }

      return response.json();
    } catch (error) {
      if (error instanceof RateLimitError) {
        // Re-throw so the job returns to waiting instead of failing
        throw error;
      }
      // Handle other errors
      throw new Error(`Failed to fetch: ${error.message}`);
    }
  },
  },
  {
    connection: {
      host: 'localhost',
      port: 6379,
    },
    limiter: {
      max: 1,
      duration: 500,
    },
  },
);
You must include limiter options in your worker configuration for manual rate limiting to work: without a limiter.max value, the rate limit check is never executed.
When using manual rate limiting, you must throw Worker.RateLimitError() to differentiate rate limiting from actual job failures. This ensures the job returns to the waiting state instead of being marked as failed.
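The Retry-After parsing above can be factored into a small helper. The following is a sketch (the name retryAfterToMs and the 5-second fallback are illustrative choices, not BullMQ API) that handles both forms the HTTP spec allows for the header, delta-seconds and an HTTP-date:

```typescript
// Convert a Retry-After header value to a delay in milliseconds.
// Accepts delta-seconds (e.g. "120") or an HTTP-date; falls back to 5000 ms.
function retryAfterToMs(retryAfter: string | null, fallbackMs = 5000): number {
  if (!retryAfter) return fallbackMs;

  // Delta-seconds form, e.g. "120"
  const seconds = Number(retryAfter);
  if (Number.isFinite(seconds) && seconds >= 0) {
    return seconds * 1000;
  }

  // HTTP-date form, e.g. "Wed, 21 Oct 2026 07:28:00 GMT"
  const dateMs = Date.parse(retryAfter);
  if (!Number.isNaN(dateMs)) {
    return Math.max(0, dateMs - Date.now());
  }

  return fallbackMs;
}
```

In the processor above you would then call `await worker.rateLimit(retryAfterToMs(response.headers.get('retry-after')));` before throwing Worker.RateLimitError().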

Checking Rate Limit Status

Get Rate Limit TTL

Check if your queue is currently rate limited and when it will reset:
import { Queue } from 'bullmq';

const queue = new Queue('myQueue', { connection });
const maxJobs = 100;

const ttl = await queue.getRateLimitTtl(maxJobs);

if (ttl > 0) {
  console.log(`Queue is rate limited. Resets in ${ttl}ms`);
} else {
  console.log('Queue is not rate limited');
}

Remove Rate Limit

Manually clear the rate limit to allow immediate job processing:
import { Queue } from 'bullmq';

const queue = new Queue('myQueue', { connection });

// Remove the rate limit
await queue.removeRateLimitKey();

console.log('Rate limit removed - workers can pick jobs immediately');
Removing the rate limit key resets the counter to zero, allowing workers to immediately begin processing jobs again.

Practical Examples

Example 1: External API Rate Limits

Many third-party APIs enforce rate limits:
import { Queue, Worker } from 'bullmq';

const githubQueue = new Queue('github-api');

// GitHub API allows 5000 requests per hour
const worker = new Worker('github-api', async job => {
  const response = await fetch(
    `https://api.github.com/users/${job.data.username}`
  );
  return response.json();
}, {
  limiter: {
    max: 5000,
    duration: 3600000, // 1 hour
  },
});

Example 2: Email Service Limits

import { Worker } from 'bullmq';

const emailWorker = new Worker('emails', async job => {
  await sendEmail({
    to: job.data.to,
    subject: job.data.subject,
    body: job.data.body,
  });
}, {
  limiter: {
    max: 100,
    duration: 86400000, // 24 hours (SendGrid free tier)
  },
});

Example 3: Burst Protection

import { Worker } from 'bullmq';

const worker = new Worker('notifications', async job => {
  await sendPushNotification(job.data);
}, {
  limiter: {
    max: 50,
    duration: 10000, // 50 per 10 seconds = 300/minute average
  },
});
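To sanity-check a limiter configuration, the average throughput it allows can be computed directly. A quick sketch (averagePerMinute is a hypothetical helper, not part of BullMQ):

```typescript
// Average jobs per minute allowed by a limiter of { max, duration }.
function averagePerMinute(max: number, durationMs: number): number {
  return (max * 60_000) / durationMs;
}

averagePerMinute(50, 10_000);     // the burst-protection example above: 300 jobs/minute
averagePerMinute(5000, 3_600_000); // the GitHub example: ~83 jobs/minute
```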

Rate Limiting vs Concurrency

Rate limiting controls how many jobs are processed over a time period:
limiter: { max: 100, duration: 60000 } // 100 jobs per minute
Concurrency controls how many jobs run simultaneously:
concurrency: 10 // 10 jobs at the same time
These work together:
import { Worker } from 'bullmq';

const worker = new Worker('tasks', async job => {
  return await processJob(job);
}, {
  concurrency: 5,    // Max 5 jobs running at once
  limiter: {
    max: 100,        // Max 100 jobs per minute
    duration: 60000,
  },
});
See Concurrency and Global Concurrency for more details.
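Which of the two settings actually caps throughput depends on how long a job takes. A rough steady-state estimate (a hypothetical back-of-the-envelope helper that ignores Redis round-trips and scheduling overhead):

```typescript
// Estimated jobs/second: the smaller of what concurrency allows
// (concurrency jobs in flight, each taking avgJobMs) and what the limiter allows.
function estimatedThroughput(
  concurrency: number,
  avgJobMs: number,
  limiterMax: number,
  limiterDurationMs: number,
): number {
  const concurrencyRate = (concurrency * 1000) / avgJobMs;
  const limiterRate = (limiterMax * 1000) / limiterDurationMs;
  return Math.min(concurrencyRate, limiterRate);
}

// With the settings above (concurrency 5, 100 jobs/minute) and 200 ms jobs,
// the limiter is the bottleneck: 100/60, about 1.67 jobs/second.
```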

Monitoring Rate Limits

Track rate limit behavior in your application:
import { Queue, QueueEvents } from 'bullmq';

const queue = new Queue('myQueue');
const queueEvents = new QueueEvents('myQueue');

setInterval(async () => {
  const ttl = await queue.getRateLimitTtl(100);
  const waiting = await queue.getWaitingCount();
  
  if (ttl > 0) {
    console.log(`Rate limited. ${waiting} jobs waiting. Resets in ${ttl}ms`);
  }
}, 5000);

queueEvents.on('waiting', ({ jobId }) => {
  console.log(`Job ${jobId} moved to waiting (possibly rate limited)`);
});

Best Practices

1. Match external rate limits
Configure your rate limits to match, or sit slightly below, external API limits to avoid 429 errors.

2. Use manual rate limiting for dynamic limits
When APIs return rate limit headers, use manual rate limiting to respect them dynamically.

3. Monitor rate limit status
Regularly check getRateLimitTtl() to understand whether your queue is being throttled.

4. Combine with retry strategies
Use rate limiting alongside retry strategies for robust error handling.

5. Consider multiple queues
For different rate limit requirements, use separate queues with different limits.
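For practice 4, retries are typically paired with exponential backoff so repeated failures spread out over time. A sketch of the common delay * 2^(attempt - 1) pattern (backoffDelay is a hypothetical helper shown for illustration; it is not necessarily the exact formula of BullMQ's built-in 'exponential' strategy):

```typescript
// Generic exponential backoff: the base delay doubles with each attempt.
function backoffDelay(baseMs: number, attempt: number): number {
  return baseMs * 2 ** (attempt - 1);
}

// With a 1-second base delay:
// attempt 1 -> 1000 ms, attempt 2 -> 2000 ms, attempt 3 -> 4000 ms
```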

Related Topics

Global Rate Limit: queue-level rate limiting across all workers
Concurrency: control parallel job processing
Global Concurrency: limit concurrent jobs across all workers
Retrying Failing Jobs: handle job failures with retries
