Gitaly provides backpressure mechanisms to prevent resource exhaustion when handling traffic surges. Instead of accepting more requests than it can handle, Gitaly can push back on clients through concurrency limits and rate limits.

Why Backpressure Matters

Gitaly sits at the bottom of the GitLab stack for Git data access. All Git operations flow through Gitaly, making it vulnerable to:
  • Traffic surges that overwhelm system resources
  • Large repository operations that block other requests
  • Resource exhaustion from concurrent operations
Backpressure allows Gitaly to gracefully reject requests it cannot handle, rather than degrading service for all users.

Concurrency Queue

Limit the number of concurrent RPCs in flight per repository using the [[concurrency]] configuration.

Basic Configuration

[[concurrency]]
rpc = "/gitaly.SmartHTTPService/PostUploadPackWithSidechannel"
max_per_repo = 1

How It Works

With max_per_repo = 1:
  1. A clone request arrives for repository “A” (a large repository)
  2. While the first request executes, a second request for repository “A” arrives
  3. The second request blocks and waits in a queue
  4. When the first request completes, the second request proceeds
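The queueing behavior above can be sketched as a per-repository semaphore. This is an illustrative sketch, not Gitaly's actual implementation; the repository name and the limit of 1 mirror the example configuration.

```go
package main

import (
	"fmt"
	"sync"
)

// limiter keeps one semaphore (a buffered channel) per repository.
type limiter struct {
	mu         sync.Mutex
	perRepo    map[string]chan struct{}
	maxPerRepo int
}

func newLimiter(maxPerRepo int) *limiter {
	return &limiter{perRepo: map[string]chan struct{}{}, maxPerRepo: maxPerRepo}
}

func (l *limiter) acquire(repo string) {
	l.mu.Lock()
	sem, ok := l.perRepo[repo]
	if !ok {
		sem = make(chan struct{}, l.maxPerRepo)
		l.perRepo[repo] = sem
	}
	l.mu.Unlock()
	sem <- struct{}{} // blocks while maxPerRepo requests are already in flight
}

func (l *limiter) release(repo string) {
	l.mu.Lock()
	sem := l.perRepo[repo]
	l.mu.Unlock()
	<-sem // free one slot, unblocking the next queued request
}

func main() {
	l := newLimiter(1) // max_per_repo = 1, as in the example configuration
	l.acquire("repository-A") // first clone request proceeds
	// A second acquire("repository-A") here would block until release is called.
	l.release("repository-A")
	fmt.Println("request completed")
}
```

A buffered channel of capacity max_per_repo gives the blocking behavior for free: sends block once the channel is full, which is exactly step 3 above.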

Queue Management

To prevent unbounded memory usage from queued requests, configure queue limits:
[[concurrency]]
rpc = "/gitaly.SmartHTTPService/PostUploadPackWithSidechannel"
max_per_repo = 1
max_queue_wait = "1m"
max_queue_size = 5

Parameters

  • max_per_repo (integer): Maximum number of concurrent requests per repository for this RPC.
  • max_queue_wait (duration): Maximum time a request can wait in the queue. Requests exceeding this time receive an error.
  • max_queue_size (integer): Maximum number of requests that can wait in the queue. Additional requests are rejected immediately.

Rate Limiting

Rate limiting restricts how frequently operations can run per repository using a token bucket algorithm.

Configuration

[[rate_limiting]]
rpc = "/gitaly.RepositoryService/RepackFull"
interval = "1m"
burst = 1

Token Bucket Algorithm

The rate limiter uses a token bucket for each RPC per repository:
  1. The bucket has a capacity of burst tokens
  2. The bucket refills at the specified interval
  3. Each request consumes one token from the bucket
  4. When the bucket is empty, requests are rejected until it refills

Example

With the configuration above:
  • The token bucket has a capacity of 1
  • It refills every minute
  • Gitaly accepts only 1 RepackFull request per repository per minute
  • Additional requests within that minute are rejected

Parameters

  • rpc (string, required): The fully qualified RPC method name to rate limit.
  • interval (duration, required): How often the token bucket refills.
  • burst (integer, required): Token bucket capacity (the number of requests allowed per interval).

Error Handling

When limits are exceeded, Gitaly returns a structured gRPC gitalypb.LimitError containing:
  • Message - Human-readable error description
  • BackoffDuration - Suggested wait time before retrying (0 means don’t retry)
Gitaly clients (gitlab-shell, workhorse, Rails) parse these errors and display appropriate messages to:
  • Users cloning via HTTP or SSH
  • The GitLab web application
  • API consumers

Monitoring

Prometheus metrics provide visibility into backpressure events:
  • Requests waiting in concurrency queues
  • Queue wait times
  • Rate limit rejections
  • Backoff durations
See the GitLab monitoring documentation for details on accessing these metrics.

Video Tutorial

Watch "How to configure backpressure in Gitaly" for a detailed walkthrough of the configuration options.

Best Practices

  • Start conservative: Begin with stricter limits and gradually relax them based on monitoring data. It’s easier to loosen restrictions than to recover from resource exhaustion.
  • Monitor queues: Track queue wait times and sizes to identify operations that need tuning. Consistently full queues indicate capacity issues.
  • Tune per RPC: Different RPCs have different resource profiles. Expensive operations like RepackFull need stricter limits than lightweight operations.
  • Test client behavior: Verify that clients handle LimitError responses gracefully before deploying strict limits to production.
