Indexing performance
Bulk indexing
Always use the Bulk API for high-throughput indexing. A single bulk request per client thread amortizes the per-request overhead across many documents. Start with batches of 5–15 MB (uncompressed) and tune from there. Larger batches do not always produce better throughput, and they increase GC pressure.
Refresh interval
Elasticsearch makes newly indexed documents searchable by performing a refresh, which is an expensive operation. The default interval is 1s.
During a large bulk load, disable refreshes entirely, then re-enable them when the load completes:
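As a sketch, both steps use the index settings API; the index name my-index is illustrative. The first call disables refreshes, and the second restores the default interval once the load is done:

```console
PUT /my-index/_settings
{ "index": { "refresh_interval": "-1" } }

PUT /my-index/_settings
{ "index": { "refresh_interval": "1s" } }
```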
Run your bulk indexing job
Send documents using the Bulk API. With refreshes disabled, Elasticsearch produces far fewer small segments, so merge activity drops and write throughput increases significantly.
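A minimal bulk request might look like the following (the index name my-index and the documents are illustrative). Each action line is followed by its document source, and every line, including the last, must end with a newline:

```console
POST /my-index/_bulk
{ "index": {} }
{ "message": "first document" }
{ "index": {} }
{ "message": "second document" }
```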
Replica count during bulk load
Replicas double (or more) the indexing work because each shard copy must receive and index every document. Set replicas to 0 during a large initial load, then increase the count when the load is complete:
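Sketched with the index settings API (the index name and final replica count are illustrative):

```console
PUT /my-index/_settings
{ "index": { "number_of_replicas": 0 } }

PUT /my-index/_settings
{ "index": { "number_of_replicas": 1 } }
```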
Search performance
Caches
Elasticsearch maintains several caches that improve repeated query performance.
Node query cache (filter cache)
Caches the results of filter clauses (queries in a filter context) at the segment level. Shared across all shards on a node. Filters are cached automatically when Elasticsearch determines the query is used often enough; you cannot manually pin items in the cache.
Configured per node in elasticsearch.yml:
| Setting | Default | Description |
|---|---|---|
| indices.queries.cache.size | 10% | Size of the node-level query cache as a percentage of JVM heap, or an absolute byte value. |
| index.queries.cache.enabled | true | Per-index setting to enable or disable the query cache. |
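As a sketch, the per-index setting is static and can be turned off when an index is created (the index name here is illustrative); the node-level size is configured in elasticsearch.yml:

```console
PUT /my-index
{
  "settings": { "index.queries.cache.enabled": false }
}
```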
Shard request cache
Field data cache
Holds uninverted field values in memory for use during aggregations on text fields, sorting, and some scripting operations. Field data is loaded lazily on first use and is expensive to build.
| Setting | Default | Description |
|---|---|---|
| indices.fielddata.cache.size | Unbounded | Maximum heap fraction or byte size for the field data cache. Recommended to set an explicit limit (e.g. 40%). |
Prefer keyword fields with doc_values (the default) for aggregations. doc_values are stored on disk and do not consume heap in the field data cache.
Shard sizing
Shard count and size are the most common source of performance problems in Elasticsearch.
Target 10–50 GB per shard
Shards smaller than 10 GB create overhead: more metadata, more threads, more inter-node coordination. Shards larger than 50 GB slow recovery and rebalancing.
Limit shard count per node
Each shard consumes JVM heap for metadata. A common guideline is to keep shard count below 20 shards per GB of heap. On an 8 GB heap node, keep shards under ~160.
Lower index.number_of_shards on future indices to reduce the total shard count.
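For example, a new index can be created with an explicit shard count at creation time (the index name and the values shown are illustrative; number_of_shards cannot be changed after creation without reindexing or shrinking):

```console
PUT /my-index
{
  "settings": {
    "index.number_of_shards": 1,
    "index.number_of_replicas": 1
  }
}
```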
Thread pools
Elasticsearch uses dedicated thread pools for different operation types. You can see their current state with the cat thread pool API.
| Pool | Purpose | Default size |
|---|---|---|
| write | Bulk, index, delete, and update requests | Number of available processors |
| search | Search and aggregation requests | int((# of available processors * 3) / 2) + 1 |
| analyze | Analyze API requests | 1 |
Pool and queue sizes can be overridden in elasticsearch.yml, but they rarely need changing. The defaults are well-tuned for most hardware.
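A sketch of inspecting the write and search pools with the cat thread pool API; the column selection via the h parameter is one reasonable choice, not the only one:

```console
GET /_cat/thread_pool/write,search?v&h=node_name,name,active,queue,rejected
```

A growing rejected count on a pool is the usual signal that it is saturated.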
Increasing queue_size delays rejection errors at the cost of higher memory pressure during traffic spikes. Increasing the thread count beyond the number of CPU cores leads to context-switch overhead that degrades throughput rather than improving it.
Circuit breakers
Circuit breakers prevent JVM out-of-memory errors by rejecting requests that would exceed configured memory limits. When a circuit breaker trips, Elasticsearch returns an HTTP 429 or 503 error rather than crashing.
Field data circuit breaker
Limits the total amount of heap used by the field data cache.
| Setting | Default | Description |
|---|---|---|
| indices.breaker.fielddata.limit | 40% | Maximum heap fraction for field data. Requests that would exceed this trigger a CircuitBreakingException. |
| indices.breaker.fielddata.overhead | 1.03 | A multiplier applied to field data size estimates before checking against the limit. |
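This limit is dynamic, so it can be adjusted at runtime through the cluster settings API; the 30% value below is illustrative:

```console
PUT /_cluster/settings
{
  "persistent": {
    "indices.breaker.fielddata.limit": "30%"
  }
}
```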
Request circuit breaker
Limits the memory used by a single request, including in-memory aggregation data structures.
| Setting | Default | Description |
|---|---|---|
| indices.breaker.request.limit | 60% | Maximum heap fraction for a single request’s in-memory structures. |
| indices.breaker.request.overhead | 1 | Multiplier applied to request memory estimates. |
In-flight requests circuit breaker
Limits the total memory consumed by all currently in-flight requests, including transport and HTTP layer request bodies.
| Setting | Default | Description |
|---|---|---|
| network.breaker.inflight_requests.limit | 100% | Maximum heap fraction for in-flight request byte sizes. |
| network.breaker.inflight_requests.overhead | 2 | Multiplier applied to in-flight request size estimates. |
Parent circuit breaker
An overall cap that all other circuit breakers count against. Protects against multiple breakers individually staying within their limits while collectively exhausting the heap.
| Setting | Default | Description |
|---|---|---|
| indices.breaker.total.limit | 70% (or 95% with real memory tracking) | Maximum combined heap fraction for all circuit breakers. |
| indices.breaker.total.use_real_memory | true | When true, the parent breaker accounts for actual JVM memory usage rather than estimates. More accurate but slightly more CPU-intensive. |
Slow logs
Slow logs record queries and indexing operations that exceed configurable time thresholds. They are the primary diagnostic tool for identifying expensive operations.
Search slow log
Set thresholds per index. Requests exceeding a threshold are written to the slow log at the corresponding level.
Indexing slow log
Entries are written to dedicated log files (*_index_indexing_slowlog.json and *_index_search_slowlog.json) alongside the main Elasticsearch logs.
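A sketch of setting both search and indexing thresholds on one index; the index name and the threshold values are illustrative and should be tuned to your own latency expectations:

```console
PUT /my-index/_settings
{
  "index.search.slowlog.threshold.query.warn": "10s",
  "index.search.slowlog.threshold.query.info": "1s",
  "index.indexing.slowlog.threshold.index.warn": "10s",
  "index.indexing.slowlog.threshold.index.info": "1s"
}
```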
