Why Backpressure Matters
Gitaly sits at the bottom of the GitLab stack for Git data access. All Git operations flow through Gitaly, making it vulnerable to:

- Traffic surges that overwhelm system resources
- Large repository operations that block other requests
- Resource exhaustion from concurrent operations
Concurrency Queue
Limit the number of concurrent RPCs in flight per repository using the `[[concurrency]]` configuration.
Basic Configuration
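A minimal sketch in Gitaly's TOML configuration format; the RPC name and value below are illustrative, not a recommendation:

```toml
# Limit in-flight RPCs per repository. Use the fully qualified
# name of the method you want to limit; this one is an example.
[[concurrency]]
rpc = "/gitaly.SmartHTTPService/PostUploadPack"
max_per_repo = 20
```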
How It Works
With `max_per_repo = 1`:
- A clone request arrives for repository “A” (a large repository)
- While the first request executes, a second request for repository “A” arrives
- The second request blocks and waits in a queue
- When the first request completes, the second request proceeds
Queue Management
To prevent unbounded memory usage from queued requests, configure queue limits:

Parameters
- `max_per_repo` - Maximum number of concurrent requests per repository for this RPC
- `max_queue_wait` - Maximum time a request can wait in the queue. Requests exceeding this time receive an error.
- `max_queue_size` - Maximum number of requests that can wait in the queue. Additional requests are rejected immediately.
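Putting these together, a sketch in Gitaly's TOML format; the RPC name and values are illustrative:

```toml
[[concurrency]]
rpc = "/gitaly.SmartHTTPService/PostUploadPack"  # example RPC
max_per_repo = 20      # concurrent requests per repository
max_queue_wait = "1m"  # requests waiting longer than this receive an error
max_queue_size = 100   # requests beyond this are rejected immediately
```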
Rate Limiting
Rate limiting restricts how frequently operations can run per repository using a token bucket algorithm.

Configuration
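A sketch of a rate-limiting entry, assuming Gitaly's TOML format; the section and parameter names reflect Gitaly's rate-limiting configuration, and the values match the `RepackFull` example discussed in this section:

```toml
[[rate_limiting]]
rpc = "/gitaly.RepositoryService/RepackFull"
interval = "1m"  # how often the token bucket refills
burst = 1        # bucket capacity: requests allowed per interval
```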
Token Bucket Algorithm
The rate limiter uses a token bucket for each RPC per repository:

- The bucket has a capacity of `burst` tokens
- The bucket refills at the specified `interval`
- Each request consumes one token from the bucket
- When the bucket is empty, requests are rejected until it refills
Example
With the configuration above:

- The token bucket has a capacity of 1
- It refills every minute
- Gitaly accepts only 1 `RepackFull` request per repository per minute
- Additional requests within that minute are rejected
Parameters
- `rpc` - The fully qualified RPC method name to rate limit
- `interval` - How often the token bucket refills
- `burst` - Token bucket capacity (number of requests allowed per interval)
Error Handling
When limits are exceeded, Gitaly returns a structured gRPC `gitalypb.LimitError` containing:
- `Message` - Human-readable error description
- `BackoffDuration` - Suggested wait time before retrying (0 means don’t retry)
These errors propagate to Gitaly clients, including:

- Users cloning via HTTP or SSH
- The GitLab web application
- API consumers
Monitoring
Prometheus metrics provide visibility into backpressure events:

- Requests waiting in concurrency queues
- Queue wait times
- Rate limit rejections
- Backoff durations
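Queries along these lines can chart queueing pressure. The metric names here are assumptions based on Gitaly's concurrency-limiting metrics; check the metrics your Gitaly version actually exports before using them:

```promql
# Requests currently waiting in concurrency queues (metric name illustrative)
sum(gitaly_concurrency_limiting_queued) by (grpc_method)

# 95th percentile time spent acquiring a slot (metric name illustrative)
histogram_quantile(0.95,
  sum(rate(gitaly_concurrency_limiting_acquiring_seconds_bucket[5m])) by (le, grpc_method))
```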
Video Tutorial
Watch How to configure backpressure in Gitaly for a detailed walkthrough of configuration options.

Best Practices
Start with conservative limits
Begin with stricter limits and gradually relax them based on monitoring data. It’s easier to loosen restrictions than to recover from resource exhaustion.
Monitor queue metrics
Track queue wait times and sizes to identify operations that need tuning. Consistently full queues indicate capacity issues.
Configure per-operation limits
Different RPCs have different resource profiles. Expensive operations like `RepackFull` need stricter limits than lightweight operations.
Test backpressure behavior
Verify that clients handle `LimitError` responses gracefully before deploying strict limits to production.