cluster.name setting; this is how nodes discover each other and form a cluster.
A cluster can have a single node (useful for local development) or hundreds of nodes distributed across multiple availability zones. Elasticsearch handles data distribution, replication, and request routing automatically regardless of cluster size.
Node roles
A node is a single running instance of Elasticsearch. Each node can be assigned one or more roles that define what work it performs. Roles are set in elasticsearch.yml under the node.roles key.
Master-eligible
A master-eligible node can be elected as the active master. The elected master is responsible for managing cluster-wide state: creating and deleting indices, tracking which nodes belong to the cluster, and deciding where shards are allocated.
In production, run at least three dedicated master-eligible nodes so that a quorum can still be formed if one of them fails. Dedicated master nodes need only modest resources: they do not hold data and are not on the hot path for search or indexing.
Data
Data nodes hold shards and execute data-related operations: indexing documents, running searches, and executing aggregations. They are the workhorses of the cluster and typically require the most CPU, memory, and disk.
Elasticsearch also supports tiered data roles (data_hot, data_warm, data_cold, data_frozen) for implementing Index Lifecycle Management (ILM) across storage tiers.
Ingest
Ingest nodes run ingest pipelines: sequences of processors that transform documents before they are indexed. Common processors include grok, set, remove, date, and script.
For high-volume pipelines, dedicate separate ingest nodes so that pipeline processing does not compete with indexing or search on data nodes.
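As a rough illustration of "a sequence of processors", the sketch below defines a pipeline body in the JSON shape used by the ingest pipeline API (PUT _ingest/pipeline/<id>) and runs it through a toy interpreter. The interpreter, the field names, and the sample document are assumptions made for this example; it handles only set and remove and is not how Elasticsearch executes pipelines.

```python
# Pipeline body in the shape sent to PUT _ingest/pipeline/<id>.
# The description and field names are hypothetical.
PIPELINE = {
    "description": "Tag documents and drop a scratch field",
    "processors": [
        {"set": {"field": "environment", "value": "production"}},
        {"remove": {"field": "debug_info"}},
    ],
}


def apply_pipeline(pipeline, doc):
    """Toy interpreter for `set` and `remove`; processors run in list order."""
    doc = dict(doc)  # copy so the caller's document is left untouched
    for processor in pipeline["processors"]:
        # Each processor is a one-key dict: {processor_type: options}.
        kind, options = next(iter(processor.items()))
        if kind == "set":
            doc[options["field"]] = options["value"]
        elif kind == "remove":
            # Simplification: this toy version silently skips missing fields.
            doc.pop(options["field"], None)
        else:
            raise ValueError(f"unsupported processor: {kind}")
    return doc


incoming = {"message": "user logged in", "debug_info": "trace-123"}
print(apply_pipeline(PIPELINE, incoming))
# {'message': 'user logged in', 'environment': 'production'}
```

The document reaches the index only after every processor has run, which is why heavy pipelines can justify dedicated ingest nodes.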
Coordinating-only
A coordinating-only node has no roles assigned (or only the implicit coordinating role). It acts as a smart load balancer: it receives client requests, fans them out to the appropriate data nodes, and merges the results before returning the response to the client.
Coordinating nodes are useful when you have heavy aggregation workloads or large result sets that require significant memory to merge. They shield data nodes from the memory pressure of result merging.
Remote cluster client
A node with the remote_cluster_client role can connect to remote clusters for cross-cluster search (CCS) and cross-cluster replication (CCR).
Machine learning
ML nodes run machine learning jobs, such as anomaly detection and inference tasks. They require significant memory and CPU and should be isolated from data nodes in production deployments.
In development or small deployments, a single node typically runs all roles simultaneously. This is fine for testing, but dedicated role assignments are strongly recommended in production for isolation and resource management.
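To make the role assignments above concrete, here is a small sketch that renders the node.roles line for elasticsearch.yml for a few node types. The helper name, the node-type labels, and the layout itself are hypothetical; the set of role names is limited to those mentioned on this page and is not a complete list.

```python
# Role names mentioned on this page (not an exhaustive list).
KNOWN_ROLES = {
    "master", "data", "data_hot", "data_warm", "data_cold", "data_frozen",
    "ingest", "ml", "remote_cluster_client",
}


def node_roles_line(roles):
    """Render the node.roles line for elasticsearch.yml."""
    unknown = set(roles) - KNOWN_ROLES
    if unknown:
        raise ValueError(f"unknown roles: {sorted(unknown)}")
    if not roles:
        # An empty list leaves only the implicit coordinating role.
        return "node.roles: [ ]"
    return "node.roles: [ " + ", ".join(roles) + " ]"


# Hypothetical production layout: one entry per dedicated node type.
LAYOUT = {
    "dedicated-master": ["master"],
    "hot-data": ["data_hot"],
    "ingest": ["ingest"],
    "coordinating-only": [],
}

for name, roles in LAYOUT.items():
    print(f"# {name}")
    print(node_roles_line(roles))
```

Each printed line would go into the elasticsearch.yml of the corresponding node. A development node can simply omit node.roles and run with the default set of roles.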
Master election
In a healthy cluster, exactly one master-eligible node is the active master at any given time. The election uses a quorum-based consensus algorithm (the cluster coordination subsystem introduced in Elasticsearch 7.0, which is similar in spirit to Raft). A quorum requires a majority of master-eligible nodes to agree, so you need at least three master-eligible nodes to tolerate the loss of one.
Nodes discover each other
On startup, each master-eligible node tries to discover the others using the seed hosts configured in discovery.seed_hosts. In cloud environments, discovery plugins handle this automatically.
Quorum is formed
Once enough master-eligible nodes have found each other (at least (N/2)+1 nodes, with N/2 rounded down, for a cluster of N master-eligible nodes), an election is triggered.
Active master is elected
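A candidate wins the election once a majority of master-eligible nodes vote for it, and nodes only vote for a candidate whose cluster state is at least as up to date as their own. The winner becomes the active master for the new term. All other master-eligible nodes become followers.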
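The quorum arithmetic behind the "at least three" recommendation can be checked with a few lines of code. This is plain majority math under the (N/2)+1 rule stated above, with hypothetical function names; it does not model any adjustments Elasticsearch makes to its voting configuration.

```python
def quorum_size(master_eligible):
    """Smallest majority of N master-eligible nodes: floor(N/2) + 1."""
    if master_eligible < 1:
        raise ValueError("need at least one master-eligible node")
    return master_eligible // 2 + 1


def tolerated_failures(master_eligible):
    """How many master-eligible nodes can be lost while a quorum remains."""
    return master_eligible - quorum_size(master_eligible)


for n in (1, 2, 3, 4, 5):
    print(f"N={n}: quorum={quorum_size(n)}, tolerates={tolerated_failures(n)}")
# N=1: quorum=1, tolerates=0
# N=2: quorum=2, tolerates=0
# N=3: quorum=2, tolerates=1
# N=4: quorum=3, tolerates=1
# N=5: quorum=3, tolerates=2
```

Two nodes tolerate no failures at all, and a fourth node adds no fault tolerance over three, which is why odd counts of master-eligible nodes are the usual choice.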
Request routing
Every node in the cluster knows the cluster topology, including which shards live on which nodes. This means a client can send a request to any node, and that node will act as the coordinating node for the request. For a search request, the coordinating node:
- Determines which shards hold data relevant to the query (based on the index name and optional routing value).
- Forwards the query phase to one copy of each relevant shard (primary or replica, chosen to balance load across copies; adaptive replica selection is the default since 7.0).
- Collects the top-N document IDs and scores from each shard.
- Fetches the full _source for the top documents in the fetch phase.
- Merges the results and returns the final response to the client.
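The two-phase flow in the list above can be sketched as a toy scatter-gather over in-memory "shards". Everything here is an illustrative assumption: the shard contents, the precomputed scores, and the function names. Real shards score documents by executing the query, and shard selection and replica choice are omitted.

```python
import heapq

# Toy shards: each maps doc ID -> (score for the query, _source).
SHARDS = [
    {"a": (1.2, {"title": "alpha"}), "b": (0.4, {"title": "beta"})},
    {"c": (2.5, {"title": "gamma"}), "d": (0.9, {"title": "delta"})},
    {"e": (1.7, {"title": "epsilon"})},
]


def query_phase(shard, size):
    """Return the shard-local top `size` hits as (score, doc_id) pairs."""
    hits = [(score, doc_id) for doc_id, (score, _source) in shard.items()]
    return heapq.nlargest(size, hits)


def fetch_phase(shard, doc_id):
    """Return the stored _source for one document on one shard."""
    return shard[doc_id][1]


def search(shards, size):
    """Coordinate a search: query every shard, merge, then fetch."""
    # Query phase: gather lightweight (score, doc_id) pairs from each shard.
    candidates = []
    for shard_num, shard in enumerate(shards):
        for score, doc_id in query_phase(shard, size):
            candidates.append((score, doc_id, shard_num))
    # Merge: keep only the global top `size` hits.
    top = heapq.nlargest(size, candidates)
    # Fetch phase: load _source only for the winning documents.
    return [
        {"_id": doc_id, "_score": score,
         "_source": fetch_phase(shards[shard_num], doc_id)}
        for score, doc_id, shard_num in top
    ]


results = search(SHARDS, size=3)
print([hit["_id"] for hit in results])  # ['c', 'e', 'a']
```

Only IDs and scores travel in the query phase; the larger _source payloads are fetched for the final hits alone. The merge step is also where the coordinating node's memory cost comes from, since it holds up to size candidates per shard.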
Cluster health
Elasticsearch reports cluster health at three levels: green, yellow, and red. You can check health with the cluster health API (GET _cluster/health).
Green
All primary and replica shards are assigned and active. The cluster is fully operational.
Yellow
All primary shards are assigned, but one or more replica shards are unassigned. Data is fully accessible but redundancy is reduced. Common when running a single-node cluster with replicas configured.
Red
One or more primary shards are unassigned. Some data is unavailable. Search requests against affected indices may fail or return partial results. Immediate attention is required.
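The three levels reduce to a simple rule over shard assignment, sketched below. The classify_health function and its input shape are hypothetical and only restate the definitions above. The fetch_status helper reads the status field from the cluster health API and assumes an unsecured node at http://localhost:9200; it is defined here but not called.

```python
import json
from urllib.request import urlopen


def classify_health(shards):
    """Derive a status from (is_primary, is_assigned) pairs, one per shard copy."""
    if any(is_primary and not assigned for is_primary, assigned in shards):
        return "red"  # at least one primary is unassigned
    if any(not assigned for _is_primary, assigned in shards):
        return "yellow"  # all primaries assigned, some replica is not
    return "green"


def fetch_status(base_url="http://localhost:9200"):
    """Read `status` from GET /_cluster/health (base URL is an assumption)."""
    with urlopen(f"{base_url}/_cluster/health") as response:
        return json.load(response)["status"]


# Single-node cluster with one replica configured: the replica cannot be
# placed on the same node as its primary, so it stays unassigned.
print(classify_health([(True, True), (False, False)]))  # yellow
```

In a live cluster, use the status reported by the API rather than recomputing it; the classifier is only meant to show how the levels relate to each other.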
Further reading
Node settings
Configure node roles, heap size, thread pools, and network settings for each node type.
Performance tuning
Optimize indexing throughput, search latency, and memory usage for production workloads.
