Rate Limits

Scribe Backend currently enforces soft rate limits to ensure fair usage and system stability. This page documents those limits and best practices for high-volume usage.

Current Rate Limits

Scribe Backend does not enforce hard rate limits at the infrastructure level. However, operational limits exist to protect system resources.

Batch Operations

Maximum batch size
integer · default: 100
Maximum number of items per POST /api/queue/batch request
Example:
curl -X POST https://scribeapi.manitmishra.com/api/queue/batch \
  -H "Authorization: Bearer YOUR_JWT_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "items": [
      {"recipient_name": "Dr. Jane Smith", "recipient_interest": "ML"},
      // ... up to 100 items total
    ],
    "email_template": "Hi {{name}}!"
  }'
Requests exceeding 100 items return 400 Bad Request with the error message "Maximum 100 items per batch".

Celery Worker Concurrency

Sequential Processing: The Celery worker runs with concurrency=1 to:
  • Prevent API rate limits on external services (Exa, ArXiv, Anthropic)
  • Optimize memory usage on resource-constrained hardware (Raspberry Pi)
  • Ensure FIFO queue processing
Impact: Queue items are processed one at a time in order of submission.

External API Limits

Scribe Backend interacts with external APIs that have their own rate limits:
| Service | Rate Limit | Impact |
|---|---|---|
| Anthropic Claude | Tier-dependent | LLM calls in template_parser and email_composer |
| Exa Search | 1,000 requests/month (free tier) | Web scraping in web_scraper step |
| ArXiv API | 1 request / 3 seconds | Academic paper fetching in arxiv_helper |
The pipeline implements automatic retry logic with exponential backoff for external API rate limits.

Rate Limit Responses

429 Too Many Requests

Today, exceeding the batch-size limit returns 400 Bad Request:
{
  "detail": "Maximum 100 items per batch"
}
A true 429 Too Many Requests response is planned alongside tiered rate limits. Clients should already handle it defensively:
Retry Strategy:
const response = await fetch('/api/queue/batch', options)

if (response.status === 429) {
  // Retry-After arrives as a string; convert it explicitly
  const retryAfter = Number(response.headers.get('Retry-After') ?? 60)
  await new Promise(resolve => setTimeout(resolve, retryAfter * 1000))
  // Retry the request
}

Best Practices

Batch Submissions

If you need to process more than 100 emails:
const recipients = [...] // 250 recipients
const chunkSize = 100

for (let i = 0; i < recipients.length; i += chunkSize) {
  const chunk = recipients.slice(i, i + chunkSize)
  
  await fetch('/api/queue/batch', {
    method: 'POST',
    headers: {
      'Authorization': `Bearer ${token}`,
      'Content-Type': 'application/json'
    },
    body: JSON.stringify({
      items: chunk,
      email_template: template
    })
  })

  // Optional: Add delay between chunks
  await new Promise(resolve => setTimeout(resolve, 1000))
}
Check queue backlog before submitting additional batches:
const queueStatus = await fetch('/api/queue/', {
  headers: { 'Authorization': `Bearer ${token}` }
}).then(r => r.json())

const pendingCount = queueStatus.filter(item => item.status === 'PENDING').length

if (pendingCount < 50) {
  // Submit next batch
} else {
  // Wait for queue to drain
}
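The backlog check above can be wrapped in a small helper that decides whether to submit, keeping the threshold configurable. This is a sketch: the response shape and the 50-item threshold follow the example above, and `shouldSubmitNextBatch` is a hypothetical name, not part of the API.

```javascript
// Decide whether to submit another batch based on the current queue items.
// `queueItems` is the array returned by GET /api/queue/; `threshold` is the
// maximum pending backlog we are willing to stack another batch on top of.
function shouldSubmitNextBatch(queueItems, threshold = 50) {
  const pendingCount = queueItems.filter(item => item.status === 'PENDING').length
  return pendingCount < threshold
}
```

Call it before each chunk in the submission loop above; when it returns false, wait and re-poll instead of submitting.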
Implement exponential backoff for 429 and 5xx errors:
async function submitWithBackoff(url, options, maxRetries = 3) {
  for (let i = 0; i < maxRetries; i++) {
    const response = await fetch(url, options)

    if (response.ok) return response

    if (response.status === 429 || response.status >= 500) {
      const delay = Math.pow(2, i) * 1000 // 1s, 2s, 4s
      await new Promise(resolve => setTimeout(resolve, delay))
      continue
    }

    // Don't retry client errors
    return response
  }
  throw new Error('Max retries exceeded')
}

Polling

Recommended intervals:
  • Queue status: Poll every 2 seconds while items are pending/processing
  • Task status: Poll every 2 seconds for first 30 seconds, then 5 seconds
  • Stop polling after task completes or expires (1 hour)
async function pollTaskStatus(taskId, token) {
  let attempts = 0
  
  while (attempts < 180) { // ~14 minutes max with adaptive delays
    const response = await fetch(`/api/email/status/${taskId}`, {
      headers: { 'Authorization': `Bearer ${token}` }
    })
    const data = await response.json()

    if (data.status === 'SUCCESS' || data.status === 'FAILURE') {
      return data
    }

    // Adaptive polling: 2s for first 15 attempts, then 5s
    const delay = attempts < 15 ? 2000 : 5000
    await new Promise(resolve => setTimeout(resolve, delay))
    attempts++
  }

  throw new Error('Task timeout')
}
Always terminate polling loops when tasks complete:
const pollInterval = setInterval(async () => {
  const status = await fetch('/api/queue/', {
    headers: { 'Authorization': `Bearer ${token}` }
  }).then(r => r.json())

  const hasActive = status.some(item => 
    item.status === 'PENDING' || item.status === 'PROCESSING'
  )

  if (!hasActive) {
    clearInterval(pollInterval)
    console.log('All items processed')
  }
}, 2000)

Optimizing for High Volume

Cache email templates

Store frequently used templates in your database instead of regenerating them

Batch by template

Group recipients by template to minimize context switching
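For example, grouping can be done client-side before submission. This sketch assumes each recipient object carries a template identifier (the `template_id` field here is hypothetical, not part of the documented schema):

```javascript
// Group recipients by template so each batch uses a single email_template.
// Returns a Map of template_id -> array of recipients.
function groupByTemplate(recipients) {
  const groups = new Map()
  for (const r of recipients) {
    if (!groups.has(r.template_id)) groups.set(r.template_id, [])
    groups.get(r.template_id).push(r)
  }
  return groups
}
```

Each group can then be submitted as its own POST /api/queue/batch request.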

Monitor queue depth

Track queue length and adjust submission rate accordingly

Use webhooks (future)

Avoid polling by implementing webhook notifications (planned feature)

Processing Time Estimates

Typical pipeline execution times per email:
| Template Type | Average Time | Notes |
|---|---|---|
| GENERAL | ~8-9 seconds | Fastest (skips ArXiv step) |
| BOOK | ~9-10 seconds | Skips ArXiv step |
| RESEARCH | ~10-12 seconds | Includes ArXiv paper fetching |
Batch Processing Time:
# For a batch of 100 emails (RESEARCH type):
processing_time = 100 emails × 11 seconds/email = 1100 seconds (~18 minutes)
Processing time varies based on:
  • Template complexity
  • Web scraping content length
  • External API response times
  • LLM generation speed
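As a rough client-side aid, the averages above can be folded into an estimator. The per-type figures below are assumed midpoints of the documented ranges, and actual times vary with the factors listed:

```javascript
// Approximate per-email seconds, taken as midpoints of the ranges above.
const AVG_SECONDS = { GENERAL: 8.5, BOOK: 9.5, RESEARCH: 11 }

// Estimate total sequential processing time for a batch, in seconds.
// The Celery worker runs with concurrency=1, so items are strictly serial.
function estimateBatchSeconds(count, templateType) {
  const perEmail = AVG_SECONDS[templateType] ?? AVG_SECONDS.RESEARCH
  return count * perEmail
}
```

For 100 RESEARCH emails this reproduces the ~1100-second figure worked out above.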

Future Rate Limit Plans

The following features are planned for future releases:

Tiered Rate Limits

  • Free Tier: 100 emails/day, 10 concurrent queue items
  • Pro Tier: 1000 emails/day, 50 concurrent queue items
  • Enterprise Tier: Unlimited emails, dedicated worker

Rate Limit Headers

Future API responses will include rate limit headers:
X-RateLimit-Limit: 1000
X-RateLimit-Remaining: 847
X-RateLimit-Reset: 1640995200
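When these headers ship, a client could read them like this. A sketch against the planned header names: `headers` is any object with a `get` method (such as a fetch Response's headers), and `parseRateLimitHeaders` is a hypothetical helper, not part of the API:

```javascript
// Parse the planned rate-limit headers from a fetch-style Headers object.
// Returns null for any header that is absent (e.g. before the feature ships).
function parseRateLimitHeaders(headers) {
  const num = name => {
    const value = headers.get(name)
    return value === null ? null : Number(value)
  }
  return {
    limit: num('X-RateLimit-Limit'),
    remaining: num('X-RateLimit-Remaining'),
    resetAt: num('X-RateLimit-Reset') // Unix timestamp in seconds
  }
}
```

Returning null rather than a default keeps the function safe to call against today's responses, which omit these headers.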

Webhook Notifications

Eliminate polling with webhook support:
{
  "event": "email.generated",
  "task_id": "abc-123",
  "email_id": "550e8400-e29b-41d4-a716-446655440000",
  "status": "SUCCESS",
  "timestamp": "2025-01-13T10:30:00Z"
}
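A receiving endpoint would mostly need to decide whether an event is terminal. A minimal sketch of that handler logic, assuming the payload shape above (the function name and return shape are illustrative):

```javascript
// Summarize a webhook payload like the example above so the caller
// can decide what to do with it.
function handleWebhookEvent(payload) {
  const terminal = payload.status === 'SUCCESS' || payload.status === 'FAILURE'
  return {
    taskId: payload.task_id,
    terminal,                                 // stop any fallback polling for this task
    succeeded: payload.status === 'SUCCESS'
  }
}
```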

Monitoring Rate Limits

Track your usage with these queries:

Daily Email Count

curl -X GET "https://scribeapi.manitmishra.com/api/user/profile" \
  -H "Authorization: Bearer YOUR_JWT_TOKEN"
Response:
{
  "id": "550e8400-e29b-41d4-a716-446655440000",
  "email": "[email protected]",
  "generation_count": 42,
  "created_at": "2025-01-01T00:00:00Z"
}

Active Queue Items

curl -X GET "https://scribeapi.manitmishra.com/api/queue/" \
  -H "Authorization: Bearer YOUR_JWT_TOKEN"
Count PENDING and PROCESSING items to gauge current load.
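The count can be done in one pass over the queue response, using the statuses documented above (`countActiveItems` is a hypothetical helper name):

```javascript
// Count items that still occupy the worker: queued or in flight.
function countActiveItems(queueItems) {
  return queueItems.filter(
    item => item.status === 'PENDING' || item.status === 'PROCESSING'
  ).length
}
```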

Queue System

Learn about the database-backed queue architecture

Real-Time Updates

Optimize polling strategies for queue status
