
Overview

Yellowstone gRPC performance depends on several factors including Tokio runtime configuration, channel capacities, compression settings, and network parameters. This guide covers all available tuning options.

Tokio Runtime Configuration

Worker Threads

Configure the number of Tokio worker threads based on your CPU cores and workload:
{
  "tokio": {
    "worker_threads": 8
  }
}
worker_threads
integer
Number of worker threads in the Tokio runtime. Defaults to the number of CPU cores if not specified.
Recommendations:
  • Low traffic: 2-4 threads
  • Medium traffic: 4-8 threads
  • High traffic: 8-16 threads
  • Very high traffic: 16+ threads (ensure you have sufficient CPU cores)

CPU Affinity

Pin Tokio threads to specific CPU cores for better cache locality and performance:
{
  "tokio": {
    "worker_threads": 4,
    "affinity": "0-1,12-13"
  }
}
affinity
string
CPU core affinity specification. Use ranges (0-3) or individual cores (0,2,4) separated by commas.
Use cases:
  • NUMA systems: Pin to cores on the same NUMA node as your network interface
  • Hyper-threading: Use physical cores (e.g., "0-3,8-11") or separate logical cores
  • Shared systems: Reserve specific cores for the plugin
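The affinity string format (comma-separated individual cores and ranges) can be expanded into a core list as follows. This is a sketch of the format described above, not the plugin's own parser.

```python
def parse_affinity(spec: str) -> list[int]:
    """Expand an affinity spec like "0-1,12-13" into a sorted core list."""
    cores: set[int] = set()
    for part in spec.split(","):
        if "-" in part:
            lo, hi = part.split("-")
            cores.update(range(int(lo), int(hi) + 1))  # inclusive range
        else:
            cores.add(int(part))
    return sorted(cores)

print(parse_affinity("0-1,12-13"))  # [0, 1, 12, 13]
print(parse_affinity("0,2,4"))      # [0, 2, 4]
```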

Channel Capacity Tuning

Per-Connection Channel Capacity

{
  "grpc": {
    "channel_capacity": "100_000"
  }
}
channel_capacity
integer
default:"250000"
Capacity of the channel per gRPC connection. Increase for high-throughput clients.
Recommendations:
  • Default: 250,000 (works for most cases)
  • High-frequency traders: 500,000 - 1,000,000
  • Low-resource environments: 50,000 - 100,000
Higher channel capacity increases memory usage. Monitor memory consumption when increasing this value.
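A rough way to reason about that memory cost: capacity × average queued message size × number of connections gives the worst case when every channel fills. The 1 KiB average message size below is an assumption for illustration, not a measured figure.

```python
def channel_memory_bytes(capacity: int, avg_msg_bytes: int, connections: int) -> int:
    """Worst-case bytes buffered if every per-connection channel fills up."""
    return capacity * avg_msg_bytes * connections

# Default capacity, assumed 1 KiB average messages, 10 clients.
est = channel_memory_bytes(250_000, 1024, 10)
print(f"{est / 2**30:.1f} GiB")  # 2.4 GiB worst case
```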

Snapshot Channel Capacity

For snapshot operations (initial account loading):
{
  "grpc": {
    "snapshot_plugin_channel_capacity": null,
    "snapshot_client_channel_capacity": "50_000_000"
  }
}
snapshot_plugin_channel_capacity
integer
default:"null"
Capacity of the channel carrying accounts from the snapshot. When set, validator startup blocks if this limit is reached.
snapshot_client_channel_capacity
integer
default:"50000000"
Capacity of the client channel for snapshot data.

Compression Settings

Configure compression algorithms for optimal bandwidth/CPU tradeoff:
{
  "grpc": {
    "compression": {
      "accept": ["gzip", "zstd"],
      "send": ["gzip", "zstd"]
    }
  }
}
compression.accept
array
default:"[\"gzip\", \"zstd\"]"
Compression algorithms accepted from clients.
compression.send
array
default:"[\"gzip\", \"zstd\"]"
Compression algorithms used when sending data to clients.

Compression Algorithm Comparison

| Algorithm | Compression Ratio | CPU Usage  | Latency | Best For                |
|-----------|-------------------|------------|---------|-------------------------|
| None      | 1.0x              | Minimal    | Lowest  | Local networks, low CPU |
| gzip      | 3-4x              | Medium     | Medium  | Balanced performance    |
| zstd      | 3-5x              | Low-Medium | Low     | High throughput         |
Recommendations:
  • Low bandwidth: Use ["zstd", "gzip"] for maximum compression
  • Low latency: Disable compression or use zstd only
  • Balanced: Default ["gzip", "zstd"] works well
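The ratio column translates directly into wire bandwidth: compressed throughput ≈ raw throughput ÷ ratio. A quick sketch using midpoint ratios assumed from the table above:

```python
# Approximate ratios picked from the comparison table above (assumptions).
RATIOS = {"none": 1.0, "gzip": 3.5, "zstd": 4.0}

def compressed_mbps(raw_mbps: float, algorithm: str) -> float:
    """Estimate wire bandwidth after compression."""
    return raw_mbps / RATIOS[algorithm]

print(compressed_mbps(800, "zstd"))  # 200.0
print(compressed_mbps(800, "gzip"))  # ~228.6
```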

HTTP/2 Tuning

Adaptive Window

Enable HTTP/2 adaptive flow control windows:
{
  "grpc": {
    "server_http2_adaptive_window": true
  }
}
server_http2_adaptive_window
boolean
default:"null"
Enable adaptive HTTP/2 flow control windows for better throughput on high-latency connections.

Connection Window Sizes

{
  "grpc": {
    "server_initial_connection_window_size": 1048576,
    "server_initial_stream_window_size": 1048576
  }
}
server_initial_connection_window_size
integer
default:"65535"
Initial HTTP/2 connection window size in bytes. Increase for high-bandwidth connections.
server_initial_stream_window_size
integer
default:"65535"
Initial HTTP/2 stream window size in bytes.
Recommendations:
  • Default: 65,535 bytes (HTTP/2 default)
  • High bandwidth: 1,048,576 bytes (1 MiB)
  • Very high bandwidth: 4,194,304 bytes (4 MiB)
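A window smaller than the connection's bandwidth-delay product (BDP) caps throughput, so a reasonable starting point is window ≥ bandwidth × RTT. The numbers below are illustrative:

```python
def bdp_window_bytes(bandwidth_mbps: float, rtt_ms: float) -> int:
    """Bandwidth-delay product: bytes in flight needed to keep the pipe full."""
    bytes_per_sec = bandwidth_mbps * 1_000_000 / 8
    return int(bytes_per_sec * rtt_ms / 1000)

# 100 Mbps with an 80 ms RTT needs ~1 MB in flight, which is why the
# 1 MiB window suits high-bandwidth or high-latency links.
print(bdp_window_bytes(100, 80))  # 1000000
```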

Keepalive Settings

Prevent connection timeouts through load balancers:
{
  "grpc": {
    "server_http2_keepalive_interval": "30s",
    "server_http2_keepalive_timeout": "10s"
  }
}
server_http2_keepalive_interval
duration
default:"null"
Interval between HTTP/2 keepalive pings.
server_http2_keepalive_timeout
duration
default:"null"
Timeout waiting for keepalive ping acknowledgment.
Most cloud load balancers (Cloudflare, AWS ALB, etc.) close idle connections after 30-120 seconds. Set server_http2_keepalive_interval to 30-60 seconds to prevent disconnections.
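One way to pick an interval, assuming you know the load balancer's idle timeout: ping at roughly half the timeout, clamped to the 30-60 second range suggested above. The clamping bounds are this guide's recommendation, not a plugin requirement.

```python
def keepalive_interval_secs(lb_idle_timeout_secs: int) -> int:
    """Ping at ~half the LB idle timeout, clamped to 30-60 seconds."""
    return max(30, min(60, lb_idle_timeout_secs // 2))

print(keepalive_interval_secs(120))  # 60
print(keepalive_interval_secs(60))   # 30
```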

Message Size Limits

{
  "grpc": {
    "max_decoding_message_size": "4_194_304"
  }
}
max_decoding_message_size
integer
default:"4194304"
Maximum size of decoded messages in bytes (4 MiB default).
When to increase:
  • Large account data subscriptions
  • Full block subscriptions with many transactions
  • Historical data replay

Concurrency Limits

Unary Request Concurrency

{
  "grpc": {
    "unary_concurrency_limit": 100
  }
}
unary_concurrency_limit
integer
default:"unlimited"
Maximum concurrent unary RPC requests (GetLatestBlockhash, GetBlockHeight, etc.).

Subscription Limits

{
  "grpc": {
    "subscription_limit": 1000,
    "subscription_limit_enforce": false
  }
}
subscription_limit
integer
default:"1000"
Maximum concurrent subscriptions per subscriber ID (based on x-subscription-id header or remote IP).
subscription_limit_enforce
boolean
default:"false"
When true, reject subscriptions exceeding the limit with RESOURCE_EXHAUSTED. When false, only log and emit metrics.
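The interaction between the two settings can be sketched like this: a counter keyed by subscriber ID (the x-subscription-id header or remote IP, per the description above) that either rejects or merely warns past the limit. This is illustrative code, not the plugin's implementation.

```python
class SubscriptionLimiter:
    """Track subscriptions per subscriber ID; reject or warn past the limit."""

    def __init__(self, limit: int = 1000, enforce: bool = False):
        self.limit = limit
        self.enforce = enforce
        self.counts: dict[str, int] = {}

    def try_subscribe(self, subscriber_id: str) -> bool:
        count = self.counts.get(subscriber_id, 0)
        if count >= self.limit:
            if self.enforce:
                return False  # caller maps this to RESOURCE_EXHAUSTED
            print(f"warning: {subscriber_id} exceeds limit")  # log/metrics only
        self.counts[subscriber_id] = count + 1
        return True

limiter = SubscriptionLimiter(limit=2, enforce=True)
assert limiter.try_subscribe("client-a")
assert limiter.try_subscribe("client-a")
assert not limiter.try_subscribe("client-a")  # third subscription rejected
```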

Encoding Performance

Encoder Threads

{
  "grpc": {
    "encoder_threads": 4
  }
}
encoder_threads
integer
default:"4"
Number of threads for parallel message encoding.
Recommendations:
  • Low traffic: 2 threads
  • Medium traffic: 4 threads (default)
  • High traffic: 8-16 threads

Filter Name Caching

{
  "grpc": {
    "filter_name_size_limit": 128,
    "filter_names_size_limit": 4096,
    "filter_names_cleanup_interval": "1s"
  }
}
filter_name_size_limit
integer
default:"128"
Maximum size of individual filter names in bytes.
filter_names_size_limit
integer
default:"4096"
Number of cached filter names before cleanup.
filter_names_cleanup_interval
duration
default:"1s"
Interval for filter name cache cleanup once size limit is reached.

Replay Configuration

{
  "grpc": {
    "replay_stored_slots": 0
  }
}
replay_stored_slots
integer
default:"0"
Number of recent slots to store for replay/retransmission. Set to 0 to disable.
Use cases:
  • Enable (100-1000 slots): When clients need to catch up on missed data
  • Disable (0): For real-time only streams to save memory
Replay functionality increases memory usage proportional to the number of stored slots and data volume.

Traffic Reporting

{
  "grpc": {
    "traffic_reporting_byte_threhsold": "64KiB"
  }
}
traffic_reporting_byte_threhsold
string
default:"64KiB"
Byte threshold for updating traffic metrics in Prometheus.
  • Lower values: more accurate metrics, higher overhead
  • Higher values: less overhead, coarser metrics

Performance Monitoring

Track these Prometheus metrics to validate tuning:
# Message throughput
rate(grpc_message_sent_count[1m])

# Bandwidth usage
rate(grpc_bytes_sent[1m])

# Queue backlog (should stay low)
message_queue_size

# Cache efficiency
rate(yellowstone_grpc_pre_encoded_cache_hit[5m]) / 
  (rate(yellowstone_grpc_pre_encoded_cache_hit[5m]) + 
   rate(yellowstone_grpc_pre_encoded_cache_miss[5m]))

Example Configurations

High-Throughput Configuration

For maximum throughput with sufficient resources:
{
  "tokio": {
    "worker_threads": 16,
    "affinity": "0-7,16-23"
  },
  "grpc": {
    "channel_capacity": "500_000",
    "compression": {
      "accept": ["zstd"],
      "send": ["zstd"]
    },
    "encoder_threads": 8,
    "server_http2_adaptive_window": true,
    "server_initial_connection_window_size": 4194304,
    "server_initial_stream_window_size": 4194304
  }
}

Low-Latency Configuration

For minimum latency:
{
  "tokio": {
    "worker_threads": 8,
    "affinity": "0-7"
  },
  "grpc": {
    "channel_capacity": "100_000",
    "compression": {
      "accept": [],
      "send": []
    },
    "encoder_threads": 4
  }
}

Resource-Constrained Configuration

For limited CPU/memory environments:
{
  "tokio": {
    "worker_threads": 2
  },
  "grpc": {
    "channel_capacity": "50_000",
    "compression": {
      "accept": ["gzip"],
      "send": ["gzip"]
    },
    "encoder_threads": 2,
    "unary_concurrency_limit": 50,
    "subscription_limit": 100,
    "subscription_limit_enforce": true
  }
}
