Setup Multi-node Elasticsearch Cluster


In this guide, we are going to learn how to set up a multi-node Elasticsearch cluster. This guide has been tested on Fedora 30/Fedora 29/CentOS 7.

So what is an Elasticsearch cluster? An Elasticsearch cluster is a group of nodes that have the same cluster.name attribute. As nodes join or leave a cluster, the cluster automatically reorganizes itself to evenly distribute the data across the available nodes. This ensures increased capacity and reliability.

Each node in an Elasticsearch cluster serves one or more purposes:

  • Master-eligible node – A node that has node.master set to true (default). It is responsible for lightweight cluster-wide actions such as creating or deleting an index, tracking which nodes are part of the cluster, and deciding which shards to allocate to which nodes.
  • Data node – A node that has node.data set to true (default). Data nodes hold data and perform data related operations such as CRUD, search, and aggregations.
  • Ingest node – A node that has node.ingest set to true (default). Ingest nodes are able to apply an ingest pipeline to a document in order to transform and enrich the document before indexing.
  • Coordinating node – Its main role is to route search and indexing requests from clients to data nodes. It behaves as a smart load balancer.

See an updated guide on Setup Multinode Elasticsearch 8.x Cluster.

Setup Multi-node Elasticsearch Cluster

In this guide, we are going to set up a three-node Elasticsearch cluster with each node being master-eligible.

My Environment:

  • Node 1: es-node-01.kifarunix-demo.com
  • Node 2: es-node-02.kifarunix-demo.com
  • Node 3: es-node-03.kifarunix-demo.com

Ensure that the hostnames are resolvable on each node. If you do not have a DNS server, you can add the entries below to the hosts file (/etc/hosts) on each node.

192.168.43.103 es-node-01.kifarunix-demo.com es-node-01
192.168.43.15 es-node-02.kifarunix-demo.com es-node-02
192.168.43.62 es-node-03.kifarunix-demo.com es-node-03
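
If you would rather script the hosts file change, the sketch below appends an entry only when the hostname is not already listed, so it can be re-run safely. The function name add_host_entry is hypothetical, and the file path is passed as an argument; on a real node it would be /etc/hosts, edited as root.

```shell
#!/usr/bin/env bash
# Hypothetical helper: append "IP FQDN SHORTNAME" to a hosts file unless
# the FQDN is already listed there.
add_host_entry() {
    local hosts_file="$1" ip="$2" fqdn="$3" short="$4"
    # -F treats the name as a literal string, -w requires a whole-word match
    if ! grep -qwF -- "$fqdn" "$hosts_file" 2>/dev/null; then
        printf '%s %s %s\n' "$ip" "$fqdn" "$short" >> "$hosts_file"
    fi
}
```

For example, add_host_entry /etc/hosts 192.168.43.103 es-node-01.kifarunix-demo.com es-node-01 adds the first entry above, and running it a second time changes nothing.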

Installing Elasticsearch 7 on Fedora 30/29/CentOS 7

We have covered the installation of Elasticsearch on Fedora 30/29/CentOS 7 in our previous guides. Follow the link below to install Elasticsearch.

Install Elasticsearch 7 on Fedora 30

Install Elasticsearch 7.x on CentOS 7/Fedora 29

Setup Multi-node Elasticsearch Cluster

Once the installation of Elasticsearch on the three nodes is done, proceed to configure the Elasticsearch cluster.

Set Elasticsearch Cluster name

On each node, open the Elasticsearch configuration file and set the name of your Elasticsearch cluster.

vim /etc/elasticsearch/elasticsearch.yml
...
# ---------------------------------- Cluster -----------------------------------
#
# Use a descriptive name for your cluster:
#
cluster.name: es-nodes
...

Set Descriptive names for Elasticsearch Nodes

Node 1

...
# ------------------------------------ Node ------------------------------------
#
# Use a descriptive name for the node:
#
node.name: es-node-01
...

Node 2

# ------------------------------------ Node ------------------------------------
#
# Use a descriptive name for the node:
#
node.name: es-node-02
...

Node 3

...
# ------------------------------------ Node ------------------------------------
#
# Use a descriptive name for the node:
#
node.name: es-node-03
...
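
Since node.name is the only value that differs between the nodes here, the edit can also be scripted. The sketch below is one possible approach, not part of the Elasticsearch tooling: set_es_option is a hypothetical helper, and it assumes the key appears at most once in the file, on a single top-level line that may be commented out (as in the stock elasticsearch.yml). It relies on GNU sed.

```shell
#!/usr/bin/env bash
# Hypothetical helper: set "key: value" in a flat YAML file such as
# elasticsearch.yml. An existing line for the key (commented out or not)
# is rewritten in place; otherwise the setting is appended.
set_es_option() {
    local file="$1" key="$2" value="$3"
    # Escape dots so that "node.name" is matched literally by grep and sed
    local key_re="${key//./\\.}"
    if grep -Eq "^#?${key_re}:" "$file"; then
        sed -i -E "s|^#?${key_re}:.*|${key}: ${value}|" "$file"
    else
        printf '%s: %s\n' "$key" "$value" >> "$file"
    fi
}
```

On Node 1 this would be called as set_es_option /etc/elasticsearch/elasticsearch.yml node.name es-node-01. Values containing | or & would need escaping first, which this sketch does not attempt.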

Disable Memory Swapping

Swapping affects the stability of an Elasticsearch cluster as it can cause nodes to respond slowly or even to disconnect from the cluster. One of the ways of disabling memory swapping is by enabling memory lock. Hence, uncomment the line bootstrap.memory_lock: true.

...
# ----------------------------------- Memory -----------------------------------
#
# Lock the memory on startup:
#
bootstrap.memory_lock: true
...

For the memory lock to take effect when Elasticsearch runs under systemd, the service must be allowed to lock unlimited memory. Edit the Elasticsearch service and add the content below:

systemctl edit elasticsearch
[Service]
LimitMEMLOCK=infinity
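
systemctl edit opens an interactive editor. For unattended setups, a sketch like the following writes the same override file directly. The function name is made up for this example; the default path matches the drop-in location that systemd reports for this service (/etc/systemd/system/elasticsearch.service.d/override.conf), and the directory is a parameter so the function can be tried on a scratch directory first.

```shell
#!/usr/bin/env bash
# Hypothetical helper: write the override that "systemctl edit elasticsearch"
# would otherwise create interactively.
write_memlock_override() {
    local dropin_dir="${1:-/etc/systemd/system/elasticsearch.service.d}"
    mkdir -p "$dropin_dir"
    printf '[Service]\nLimitMEMLOCK=infinity\n' > "$dropin_dir/override.conf"
}
```

Note that this overwrites any existing override.conf in that directory.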

Whenever a systemd service is modified, you need to reload the systemd configurations.

sudo systemctl daemon-reload
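
Once Elasticsearch has been started (covered later in this guide), you can confirm that the memory lock actually took effect by asking the nodes API for the mlockall flag of each node. The helper below is a rough sketch that checks that response with plain text matching rather than a JSON parser; mlockall_ok is a hypothetical name.

```shell
#!/usr/bin/env bash
# Read on stdin the JSON returned by
#   curl -s "192.168.43.103:9200/_nodes?filter_path=**.mlockall&pretty"
# and succeed only if at least one node is reported and no node has
# "mlockall" set to false.
mlockall_ok() {
    local json
    json=$(cat)
    # Fail as soon as any node reports false
    if printf '%s\n' "$json" | grep -Eq '"mlockall"[[:space:]]*:[[:space:]]*false'; then
        return 1
    fi
    # Otherwise succeed only if some node reports true
    printf '%s\n' "$json" | grep -Eq '"mlockall"[[:space:]]*:[[:space:]]*true'
}
```

If a node reports false, the memory lock request failed on that node and its log file should say why.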

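Another recommended way to prevent swapping is to disable swap completely. This is fine if Elasticsearch is the only service running on the server. The command below turns swap off immediately; to keep it off after a reboot, also comment out the swap entries in /etc/fstab.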

swapoff -a

Define the Roles of each Elasticsearch Node

As stated above, you can assign each node a respective role: master, data node, ingest node or coordinating node. In this setup, we will configure all three nodes to act as both master-eligible and data nodes.

...
# ---------------------------------- Cluster -----------------------------------
#
# Use a descriptive name for your cluster:
#
cluster.name: es-nodes

node.master: true
node.data: true
...

Bind Elasticsearch to Non-loopback Address

Elasticsearch binds to loopback addresses by default. For a node to form a cluster with other nodes, you need to bind it to a non-loopback address. This can be done by setting the IP address of the node as the value of network.host.

Node 1

network.host: 192.168.43.103

Node 2

network.host: 192.168.43.15

Node 3

network.host: 192.168.43.62

By default, Elasticsearch uses TCP port 9200 to expose its REST APIs. TCP ports 9300-9400 are used for communication between nodes.

Discovery and cluster formation settings

There are two important discovery and cluster formation settings that should be configured before going to production so that nodes in the cluster can discover each other and elect a master node:

  • discovery.seed_hosts and cluster.initial_master_nodes.

The discovery.seed_hosts setting provides a list of master-eligible nodes in the cluster. Each value has the format host:port or host, where port defaults to the setting transport.profiles.default.port. This setting was previously known as discovery.zen.ping.unicast.hosts. Configure this setting on all nodes as follows:

...
# --------------------------------- Discovery ----------------------------------
#
# Pass an initial list of hosts to perform discovery when this node is started:
# The default list of hosts is ["127.0.0.1", "[::1]"]
#
discovery.seed_hosts: ["192.168.43.103", "192.168.43.15", "192.168.43.62"]
...

The cluster.initial_master_nodes setting defines the initial set of master-eligible nodes. This is important when starting an Elasticsearch cluster for the very first time. This setting is ignored once the cluster is formed.

...
# Bootstrap the cluster using an initial set of master-eligible nodes:
#
#cluster.initial_master_nodes: ["node-1", "node-2"]
cluster.initial_master_nodes: ["es-node-01", "es-node-02", "es-node-03"]
...

Elasticsearch 7 doesn’t use the Zen Discovery coordination method used by the previous versions and thus, the discovery.zen.minimum_master_nodes setting has been phased out so that Elasticsearch itself can choose which nodes can form a quorum.
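
In simple terms, the quorum is a majority of the master-eligible nodes. The arithmetic below is only an illustration of why three master-eligible nodes tolerate the loss of one; the function names are made up for this example, and Elasticsearch manages the actual voting set itself.

```shell
#!/usr/bin/env bash
# Majority needed among N master-eligible nodes: floor(N/2) + 1
quorum() {
    echo $(( $1 / 2 + 1 ))
}

# How many master-eligible nodes can be lost while a majority remains
tolerated_failures() {
    echo $(( $1 - $(quorum "$1") ))
}
```

For the three nodes in this guide, quorum 3 prints 2 and tolerated_failures 3 prints 1. With only two master-eligible nodes the majority is still 2, so losing either one would leave the cluster unable to elect a master.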

Set JVM Heap Size

Elasticsearch sets the heap size to 1GB by default. As a rule of thumb, set Xms and Xmx to the same value, and set Xmx to no more than 50% of your physical RAM but not more than 32GB.

vim /etc/elasticsearch/jvm.options
...
################################################################

# Xms represents the initial size of total heap space
# Xmx represents the maximum size of total heap space

-Xms1g
-Xmx1g
...
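
The rule of thumb above can be turned into a small calculation. This is a sketch under the assumptions stated in the text (half of physical RAM, capped at 32GB); the function names are hypothetical, and the result is a starting point rather than a tuned value.

```shell
#!/usr/bin/env bash
# Suggested heap in MB for a machine with the given amount of RAM in MB:
# half of RAM, capped at 32 GB (the ceiling quoted in the text above).
suggest_heap_mb() {
    local ram_mb="$1"
    local cap_mb=$(( 32 * 1024 ))
    local half=$(( ram_mb / 2 ))
    if [ "$half" -gt "$cap_mb" ]; then
        echo "$cap_mb"
    else
        echo "$half"
    fi
}

# Total RAM of the local machine in MB, read from /proc/meminfo (Linux only)
local_ram_mb() {
    awk '/^MemTotal:/ { print int($2 / 1024) }' /proc/meminfo
}
```

For a node with 8GB of RAM, suggest_heap_mb 8192 prints 4096, which would translate to -Xms4096m and -Xmx4096m in jvm.options. On the node itself, suggest_heap_mb "$(local_ram_mb)" uses the detected RAM.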

Configure Other Important Systems Settings

Set maximum Open File Descriptor

Set the maximum number of open files for the elasticsearch user to 65,535. This is already set by default in the systemd unit file, /usr/lib/systemd/system/elasticsearch.service.

...
# Specifies the maximum file descriptor number that can be opened by this process
LimitNOFILE=65535
...

You should also set the maximum number of processes.

...
# Specifies the maximum number of processes
LimitNPROC=4096

Virtual Memory Settings

Elasticsearch uses an mmapfs directory by default to store its indices. To ensure that you do not run out of virtual memory, edit /etc/sysctl.conf and update the value of vm.max_map_count as shown below.

vm.max_map_count=262144

You can simply run the command below to configure virtual memory settings.

echo "vm.max_map_count=262144" >> /etc/sysctl.conf

Reboot the system, or run sysctl -p to reload /etc/sysctl.conf immediately, then run the command sysctl vm.max_map_count to verify the configuration.

sysctl vm.max_map_count
vm.max_map_count = 262144
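
Appending with echo adds a duplicate line every time the command is run. If the step is part of a provisioning script, a sketch like the one below keeps exactly one assignment for the key. persist_sysctl is a hypothetical helper name, and it relies on GNU sed.

```shell
#!/usr/bin/env bash
# Hypothetical helper: make sure "key=value" appears exactly once in a
# sysctl configuration file, replacing any earlier value for the key.
persist_sysctl() {
    local file="$1" key="$2" value="$3"
    # Escape dots so the key is matched literally
    local key_re="${key//./\\.}"
    touch "$file"
    # Drop previous assignments of the key, then append the new one
    sed -i -E "/^[[:space:]]*${key_re}[[:space:]]*=/d" "$file"
    printf '%s=%s\n' "$key" "$value" >> "$file"
}
```

A typical call would be persist_sysctl /etc/sysctl.conf vm.max_map_count 262144, followed by sysctl -p to load the value without a reboot.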

Open Elasticsearch Ports on FirewallD

As mentioned above, Elasticsearch exposes HTTP APIs over TCP port 9200 and allows nodes to communicate over TCP ports 9300-9400. You need to open these ports on firewalld.

firewall-cmd --add-port={9200,9300-9400}/tcp --permanent
firewall-cmd --reload
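
To confirm from another node that the ports are reachable, a quick TCP probe can help. The sketch below uses bash's /dev/tcp redirection together with the coreutils timeout command; port_open is a hypothetical name. A port only answers once Elasticsearch is actually running on the target node, so a failure before that point does not necessarily indicate a firewall problem.

```shell
#!/usr/bin/env bash
# Succeeds if a TCP connection to host:port can be opened within 3 seconds.
# The host and port are passed to the inner shell as $0 and $1.
port_open() {
    local host="$1" port="$2"
    timeout 3 bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$host" "$port" 2>/dev/null
}
```

For example, running port_open 192.168.43.103 9300 && echo reachable on es-node-02 checks the transport port of es-node-01.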

Running Elasticsearch

Reload the systemd manager configuration.

systemctl daemon-reload

Enable Elasticsearch to run on system boot.

systemctl enable elasticsearch

Start Elasticsearch

systemctl start elasticsearch

You can check the Elasticsearch status by running the command below:

systemctl status elasticsearch.service
● elasticsearch.service - Elasticsearch
   Loaded: loaded (/usr/lib/systemd/system/elasticsearch.service; enabled; vendor preset: disabled)
  Drop-In: /etc/systemd/system/elasticsearch.service.d
           └─override.conf
   Active: active (running) since Thu 2019-07-11 23:03:21 EAT; 3min 5s ago
     Docs: http://www.elastic.co
 Main PID: 1539 (java)
    Tasks: 39 (limit: 3477)
   Memory: 1.3G
   CGroup: /system.slice/elasticsearch.service
           ├─1539 /usr/share/elasticsearch/jdk/bin/java -Xms1g -Xmx1g -XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=75 -XX:+UseCMSInitiat>
           └─1627 /usr/share/elasticsearch/modules/x-pack-ml/platform/linux-x86_64/bin/controller

To check for any errors, you can tail the Elasticsearch cluster log file.

tail /var/log/elasticsearch/<cluster-name>.log

Check Elasticsearch Cluster Health

curl -X GET "192.168.43.103:9200/_cluster/health?pretty"
{
  "cluster_name" : "es-nodes",
  "status" : "green",
  "timed_out" : false,
  "number_of_nodes" : 3,
  "number_of_data_nodes" : 3,
  "active_primary_shards" : 0,
  "active_shards" : 0,
  "relocating_shards" : 0,
  "initializing_shards" : 0,
  "unassigned_shards" : 0,
  "delayed_unassigned_shards" : 0,
  "number_of_pending_tasks" : 0,
  "number_of_in_flight_fetch" : 0,
  "task_max_waiting_in_queue_millis" : 0,
  "active_shards_percent_as_number" : 100.0
}

The cluster health status is green, yellow or red. A red status indicates that at least one primary shard is not allocated in the cluster, yellow means that all primary shards are allocated but one or more replicas are not, and green means that all shards are allocated.
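
For monitoring scripts it is handy to turn the status into an exit code. The helpers below are a sketch that reads the health JSON on stdin and uses plain text matching, which is adequate for the flat response shown above; the function names and the exit code mapping are choices made for this example.

```shell
#!/usr/bin/env bash
# Extract the "status" field from _cluster/health JSON read on stdin.
health_status() {
    sed -n -E 's/.*"status"[[:space:]]*:[[:space:]]*"([a-z]+)".*/\1/p' | head -n 1
}

# Map the status to an exit code: 0 = green, 1 = yellow, 2 = red or unknown
health_exit_code() {
    case "$(health_status)" in
        green)  return 0 ;;
        yellow) return 1 ;;
        *)      return 2 ;;
    esac
}
```

Usage would look like curl -s "192.168.43.103:9200/_cluster/health" | health_exit_code. The health API also accepts wait_for_status and timeout parameters (for example ?wait_for_status=green&timeout=30s), which make the request block until the cluster reaches that status or the timeout expires.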

Check the Cluster Nodes

curl -X GET "192.168.43.103:9200/_cat/nodes?v"
ip             heap.percent ram.percent cpu load_1m load_5m load_15m node.role master name
192.168.43.62            11          79   5    0.00    0.02     0.04 mdi       -      es-node-03
192.168.43.103            8          66   3    0.13    0.11     0.09 mdi       -      es-node-01
192.168.43.15             9          79   5    0.00    0.01     0.03 mdi       *      es-node-02
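
In the output above, the asterisk in the master column marks the elected master (es-node-02 here). If you only need that name in a script, you can ask the cat API for just the master and name columns and filter the result; elected_master below is a hypothetical helper for that.

```shell
#!/usr/bin/env bash
# Print the name of the elected master from the output of
#   curl -s "192.168.43.103:9200/_cat/nodes?h=master,name"
# which lists one node per line as "<* or -> <name>".
elected_master() {
    awk '$1 == "*" { print $2 }'
}
```

Piping the curl command from the comment into elected_master prints a single node name when the cluster has an elected master and nothing otherwise.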

Your Elasticsearch cluster is now working fine.

After the cluster forms successfully for the first time, remove the cluster.initial_master_nodes setting from each node's configuration. Do not use this setting when restarting a cluster or adding a new node to an existing cluster.

To disable this setting by commenting it out, run the command below on each node:

sed -i 's/^cluster.initial_master_nodes:/#&/' /etc/elasticsearch/elasticsearch.yml

Next, restart Elasticsearch on each node.

systemctl restart elasticsearch

That is all on how to set up a multi-node Elasticsearch 7.x cluster on Fedora 30/Fedora 29/CentOS 7.

Install and Configure Elastic Auditbeat on Ubuntu 18.04

Install Elastic Stack 7 on Fedora 30/Fedora 29/CentOS 7

Install Elastic Stack 7 on Ubuntu 18.04/Debian 9.8

koromicha
I am the Co-founder of Kifarunix.com, Linux and the whole FOSS enthusiast, Linux System Admin and a Blue Teamer who loves to share technological tips and hacks with others as a way of sharing knowledge as: "In vain have you acquired knowledge if you have not imparted it to others".
