Rate limiting protects your chatbot from abuse and controls costs by limiting how many messages users can send. This guide covers the chatbot’s multi-layer rate limiting implementation.
## Rate limiting architecture

The chatbot implements three layers of rate limiting:

1. **IP-based rate limiting**: prevents abuse from anonymous users
2. **User-based rate limiting**: enforces message quotas per user type
3. **API gateway limits**: provider-level rate limits (external)
## IP-based rate limiting

IP rate limiting uses Redis to track requests by IP address:

```ts
import { createClient } from "redis";
import { isProductionEnvironment } from "@/lib/constants";
import { ChatbotError } from "@/lib/errors";

const MAX_MESSAGES_PER_DAY = 10;
const TTL_SECONDS = 60 * 60 * 24;

let client: ReturnType<typeof createClient> | null = null;

function getClient() {
  if (!client && process.env.REDIS_URL) {
    client = createClient({ url: process.env.REDIS_URL });
    client.on("error", () => {});
    client.connect().catch(() => {
      client = null;
    });
  }
  return client;
}

export async function checkIpRateLimit(ip: string | undefined) {
  if (!isProductionEnvironment || !ip) return;

  const redis = getClient();
  if (!redis?.isReady) return;

  try {
    const key = `ip-rate-limit:${ip}`;
    const [count] = await redis
      .multi()
      .incr(key)
      .expire(key, TTL_SECONDS, "NX")
      .exec();

    if (typeof count === "number" && count > MAX_MESSAGES_PER_DAY) {
      throw new ChatbotError("rate_limit:chat");
    }
  } catch (error) {
    if (error instanceof ChatbotError) throw error;
    // Swallow Redis errors so an outage never blocks chat requests.
  }
}
```
IP rate limiting only runs in production (`isProductionEnvironment`) and gracefully degrades if Redis is unavailable.
### How IP rate limiting works

#### 1. Extract the IP address

Get the client IP from the request in `app/(chat)/api/chat/route.ts`:

```ts
import { ipAddress } from "@vercel/functions";

await checkIpRateLimit(ipAddress(request));
```
#### 2. Increment the counter

Use Redis `MULTI` to atomically increment the counter and set its expiry:

```ts
const key = `ip-rate-limit:${ip}`;
const [count] = await redis
  .multi()
  .incr(key)
  .expire(key, TTL_SECONDS, "NX")
  .exec();
```

The `"NX"` flag sets the expiry only if the key has none yet, so the original TTL is preserved across subsequent increments.
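Because of `"NX"`, the counter behaves as a fixed window: it expires `TTL_SECONDS` after the *first* request in the window, not the most recent one. A minimal in-memory model of the same semantics (illustrative only, not part of the chatbot code):

```typescript
// In-memory model of the fixed-window counter, for illustration only.
type WindowState = { count: number; expiresAt: number };

const windows = new Map<string, WindowState>();
const TTL_MS = 24 * 60 * 60 * 1000;

function hit(ip: string, now: number): number {
  const w = windows.get(ip);
  if (!w || now >= w.expiresAt) {
    // First request (or expired window): start a fresh 24-hour window.
    windows.set(ip, { count: 1, expiresAt: now + TTL_MS });
    return 1;
  }
  // Later requests increment the count but never move expiresAt,
  // which is what EXPIRE ... NX guarantees in Redis.
  w.count += 1;
  return w.count;
}
```

A request one millisecond before `expiresAt` still counts against the original window; one millisecond after, the count starts over at 1.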
#### 3. Check the threshold

Throw a rate limit error if the threshold is exceeded:

```ts
if (typeof count === "number" && count > MAX_MESSAGES_PER_DAY) {
  throw new ChatbotError("rate_limit:chat");
}
```
## User-based rate limiting

Authenticated users have message quotas based on their user type:

```ts
import type { UserType } from "@/app/(auth)/auth";

type Entitlements = {
  maxMessagesPerDay: number;
};

export const entitlementsByUserType: Record<UserType, Entitlements> = {
  guest: {
    maxMessagesPerDay: 10,
  },
  regular: {
    maxMessagesPerDay: 10,
  },
};
```
### Implementing user rate limits

In `app/(chat)/api/chat/route.ts`:

```ts
import { entitlementsByUserType } from "@/lib/ai/entitlements";
import { getMessageCountByUserId } from "@/lib/db/queries";

const userType: UserType = session.user.type;

const messageCount = await getMessageCountByUserId({
  id: session.user.id,
  differenceInHours: 24,
});

if (messageCount > entitlementsByUserType[userType].maxMessagesPerDay) {
  return new ChatbotError("rate_limit:chat").toResponse();
}
```
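The route handler does this comparison inline, but the quota check can be factored into a pure helper for unit testing; a hypothetical sketch (the helper name is illustrative, not part of the codebase):

```typescript
// Hypothetical pure extraction of the quota check, for unit-testability.
type Entitlements = { maxMessagesPerDay: number };

const entitlementsByUserType: Record<string, Entitlements> = {
  guest: { maxMessagesPerDay: 10 },
  regular: { maxMessagesPerDay: 10 },
};

function isOverQuota(userType: string, messageCount: number): boolean {
  // Unknown user types get no quota at all rather than unlimited access.
  const limit = entitlementsByUserType[userType]?.maxMessagesPerDay ?? 0;
  return messageCount > limit;
}
```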
This queries the database to count messages from the user in the last 24 hours:

```ts
import { and, count, eq, gte } from "drizzle-orm";
// Adjust these import paths to wherever your db client and schema live.
import { db } from "@/lib/db";
import { chat, message } from "@/lib/db/schema";
import { ChatbotError } from "@/lib/errors";

export async function getMessageCountByUserId({
  id,
  differenceInHours,
}: {
  id: string;
  differenceInHours: number;
}) {
  try {
    const twentyFourHoursAgo = new Date(
      Date.now() - differenceInHours * 60 * 60 * 1000
    );

    const [stats] = await db
      .select({ count: count(message.id) })
      .from(message)
      .innerJoin(chat, eq(message.chatId, chat.id))
      .where(
        and(
          eq(chat.userId, id),
          gte(message.createdAt, twentyFourHoursAgo),
          eq(message.role, "user")
        )
      )
      .execute();

    return stats?.count ?? 0;
  } catch (_error) {
    throw new ChatbotError(
      "bad_request:database",
      "Failed to get message count by user id"
    );
  }
}
```
Database-backed rate limiting is less performant than Redis but doesn't require additional infrastructure. Consider moving to Redis for high-traffic applications.
## Configuring rate limits

Modify the constants in `lib/ratelimit.ts`:

```ts
const MAX_MESSAGES_PER_DAY = 10; // Increase for more generous limits
const TTL_SECONDS = 60 * 60 * 24; // Change the window size
```
Add new user types with different quotas (the new keys must also be added to the `UserType` union for the `Record` type to accept them):

```ts
export const entitlementsByUserType: Record<UserType, Entitlements> = {
  guest: {
    maxMessagesPerDay: 10,
  },
  regular: {
    maxMessagesPerDay: 10,
  },
  premium: {
    maxMessagesPerDay: 100,
  },
  enterprise: {
    maxMessagesPerDay: Infinity,
  },
};
```
You can implement different limits for different models. Note that `getMessageCountByUserIdAndModel` is not defined in this guide; you would need to add a query like it yourself:

```ts
const modelLimits: Record<string, number> = {
  "gpt-4": 5,
  "gpt-4o-mini": 10,
  "claude-4.5-sonnet": 10,
};

const modelMessageCount = await getMessageCountByUserIdAndModel({
  id: session.user.id,
  model: selectedChatModel,
  differenceInHours: 24,
});

// Fall back to a default so models missing from the map are still limited.
const limit = modelLimits[selectedChatModel] ?? 10;

if (modelMessageCount > limit) {
  return new ChatbotError("rate_limit:chat").toResponse();
}
```
## Error handling

Rate limit errors use the centralized error system:

```ts
export class ChatbotError extends Error {
  type: ErrorType;
  surface: Surface;
  statusCode: number;

  constructor(errorCode: ErrorCode, cause?: string) {
    super();
    const [type, surface] = errorCode.split(":");
    this.type = type as ErrorType;
    this.surface = surface as Surface;
    this.cause = cause;
    this.message = getMessageByErrorCode(errorCode);
    this.statusCode = getStatusCodeByType(this.type);
  }

  toResponse() {
    const code: ErrorCode = `${this.type}:${this.surface}`;
    return Response.json(
      { code, message: this.message },
      { status: this.statusCode }
    );
  }
}
```
The rate limit error returns a 429 status code:

```ts
case "rate_limit:chat":
  return "You have exceeded your maximum number of messages for the day. Please try again later.";
```
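`getStatusCodeByType` itself is not shown in this guide; one plausible mapping (an assumption, so check `lib/errors.ts` for the real table) sends `rate_limit` errors back as HTTP 429:

```typescript
// Hypothetical mapping from error type to HTTP status; only the error
// types that appear in this guide are covered.
type ErrorType = "bad_request" | "unauthorized" | "rate_limit";

function getStatusCodeByType(type: ErrorType): number {
  switch (type) {
    case "bad_request":
      return 400;
    case "unauthorized":
      return 401;
    case "rate_limit":
      return 429; // Too Many Requests
  }
}
```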
## Redis setup

To enable IP-based rate limiting, configure Redis:

#### 1. Deploy Redis

Use Vercel KV, Upstash, or any other Redis provider:

```bash
# Using Vercel KV: pull the generated connection variables locally
vercel env pull
```

#### 2. Set the environment variable

Add `REDIS_URL` to your environment:

```bash
REDIS_URL=redis://default:password@host:port
```
#### 3. Test the connection

The rate limiter connects automatically when `REDIS_URL` is present:

```ts
function getClient() {
  if (!client && process.env.REDIS_URL) {
    client = createClient({ url: process.env.REDIS_URL });
    client.on("error", () => {});
    client.connect().catch(() => {
      client = null;
    });
  }
  return client;
}
```
If Redis is unavailable, the rate limiter gracefully degrades and allows requests through. This prevents Redis outages from breaking your chatbot.
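On the client, a rate-limited request can be recognized either by the 429 status or by the `code` field that `toResponse()` emits; a small sketch (the helper name is illustrative, not part of the codebase):

```typescript
// Hypothetical client-side check for rate-limit responses from the chat API.
// Assumes the { code, message } body shape produced by toResponse().
type ChatApiErrorBody = { code: string; message: string } | null;

function isRateLimited(status: number, body: ChatApiErrorBody): boolean {
  return status === 429 || body?.code === "rate_limit:chat";
}
```

The UI can use this to show a "try again later" message instead of a generic error.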
## Monitoring rate limits

Track rate limit hits and adjust limits accordingly:

```ts
export async function checkIpRateLimit(ip: string | undefined) {
  // ...
  try {
    const key = `ip-rate-limit:${ip}`;
    const [count] = await redis
      .multi()
      .incr(key)
      .expire(key, TTL_SECONDS, "NX")
      .exec();

    if (typeof count === "number" && count > MAX_MESSAGES_PER_DAY) {
      console.log(`Rate limit exceeded for IP: ${ip}, count: ${count}`);
      throw new ChatbotError("rate_limit:chat");
    }
  } catch (error) {
    if (error instanceof ChatbotError) throw error;
  }
}
```
Consider integrating with observability tools such as:

- Vercel Analytics for error tracking
- Datadog for Redis metrics
- Sentry for error reporting
## Bot detection

The chatbot uses BotID for additional bot protection in `app/(chat)/api/chat/route.ts`:

```ts
import { checkBotId } from "botid/server";

const [botResult, session] = await Promise.all([checkBotId(), auth()]);

if (botResult.isBot) {
  return new ChatbotError("unauthorized:chat").toResponse();
}
```
This runs in parallel with authentication for better performance.
## Advanced patterns

### Sliding window rate limiting

Implement more sophisticated rate limiting with a sliding window over a sorted set (note the camelCase command names used by the node-redis client shown earlier):

```ts
async function checkSlidingWindow(userId: string) {
  const now = Date.now();
  const windowStart = now - 24 * 60 * 60 * 1000;
  const key = `rate-limit:${userId}`;

  await redis
    .multi()
    // Drop entries that fell out of the 24-hour window.
    .zRemRangeByScore(key, 0, windowStart)
    // Record this request, keyed by its timestamp.
    .zAdd(key, { score: now, value: `${now}` })
    .expire(key, TTL_SECONDS)
    .exec();

  const count = await redis.zCard(key);

  if (count > MAX_MESSAGES_PER_DAY) {
    throw new ChatbotError("rate_limit:chat");
  }
}
```
### Token bucket rate limiting

Implement burst allowance with token buckets:

```ts
async function checkTokenBucket(userId: string) {
  const key = `bucket:${userId}`;
  const refillRate = 10 / (24 * 60 * 60); // 10 tokens per day, in tokens/second
  const bucketSize = 5; // Allow bursts of up to 5 messages

  const data = await redis.get(key);
  const { tokens, lastRefill } = data
    ? JSON.parse(data)
    : { tokens: bucketSize, lastRefill: Date.now() };

  const now = Date.now();
  const timePassed = (now - lastRefill) / 1000;
  const newTokens = Math.min(bucketSize, tokens + timePassed * refillRate);

  if (newTokens < 1) {
    throw new ChatbotError("rate_limit:chat");
  }

  await redis.set(
    key,
    JSON.stringify({
      tokens: newTokens - 1,
      lastRefill: now,
    }),
    { EX: TTL_SECONDS } // node-redis option names are uppercase
  );
}
```
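With the numbers above, one token refills every 8,640 seconds (2.4 hours), so a user who drains the burst of 5 then waits that long per additional message; a quick arithmetic check:

```typescript
// Sanity-check the refill arithmetic used by the token bucket above.
const refillRate = 10 / (24 * 60 * 60); // tokens per second
const secondsPerToken = 1 / refillRate; // 86400 / 10 = 8640 s
const hoursPerToken = secondsPerToken / 3600; // 2.4 h
```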
### Geographic rate limiting

Apply different limits based on user location:

```ts
import { geolocation } from "@vercel/functions";

const { country } = geolocation(request);

const limitsByCountry: Record<string, number> = {
  US: 20,
  GB: 20,
  default: 10,
};

// country can be undefined, so fall back to the default bucket.
const limit = limitsByCountry[country ?? "default"] ?? limitsByCountry.default;

if (messageCount > limit) {
  throw new ChatbotError("rate_limit:chat");
}
```
## Testing rate limits

Test rate limiting in development by temporarily lowering the limits:

```ts
// Temporarily lower limits for testing
const MAX_MESSAGES_PER_DAY = 2; // Instead of 10

// Or disable the check in development
if (process.env.NODE_ENV === "development") {
  return; // Skip rate limiting
}
```

Never deploy with rate limiting disabled. Always test with realistic limits in a staging environment.
## Next steps