Setup/configure a three node Elasticsearch cluster on CentOS 7

Updated post: This article was published in December 2019. It describes the standard way of installing Elasticsearch on Linux machines. You can follow the same steps to set up a three node Elasticsearch cluster on CentOS 8 as well; use DNF instead of YUM.

Agenda:

1, Set up a three node Elasticsearch cluster on CentOS / RHEL 7.

2, Set up a three node Elasticsearch cluster on CentOS / RHEL 8.

3, Step by step procedure to install an Elasticsearch cluster on CentOS / RHEL.

4, Elasticsearch basics.

5, Prerequisites for setting up an Elasticsearch cluster on CentOS / RHEL.

Important: This blog explains the manual process for setting up a three node Elasticsearch cluster. For automated setups, I highly recommend this Ansible role: https://github.com/elastic/ansible-elasticsearch

Elasticsearch is a widely used search engine; its other use cases include log analytics, full-text search, security intelligence and business analytics. It's open source, and you can set it up as a cluster on your own servers. In this article, we will discuss the basics of Elasticsearch and its use cases, and then set up a three node Elasticsearch cluster on CentOS servers.

A little bit of history

Shay Banon is the founder of Elasticsearch. The first version of Elasticsearch was released in February 2010. Here are a few words from Wikipedia:

While thinking about the third version of Compass he realized that it would be necessary to rewrite big parts of Compass to "create a scalable search solution". So he created "a solution built from the ground up to be distributed" and used a common interface, JSON over HTTP, suitable for programming languages other than Java as well. Shay Banon released the first version of Elasticsearch in February 2010.

Since its release in 2010, Elasticsearch has become one of the most popular search engines.

What is Elasticsearch?

Elasticsearch is an open-source, RESTful, distributed search and analytics engine built on Apache Lucene. You can use it in many areas of your infrastructure. Apart from search, it is a good option for analytics, and it is a core component of the ELK stack. You can use an Elasticsearch cluster as the data store for analysing logs and metrics.

We are not discussing these things in detail in this article. Here I will explain the steps to set up and configure a three node Elasticsearch cluster on CentOS / RHEL.

Prerequisites

1, Three CentOS / RHEL servers for setting up the Elasticsearch cluster. A production cluster should have at least three master-eligible nodes, so that it can still elect a master if one node fails.

2, If possible, attach a separate disk for data storage and configure LVM on it for future expansion. This is highly recommended.

3, Memory: Use a minimum of 2 GB per node. The more heap available to Elasticsearch, the more memory it can use for its internal caches, but the less memory it leaves for the operating system's filesystem cache. Refer to the official documentation: Setting the heap size.

4, Don’t expose the Elasticsearch process to the public internet. Make sure you have a private network for inter-node communication; in a cluster setup, the nodes need to communicate with each other.

5, Open ports 9200 (HTTP API) and 9300 (inter-node transport) on all nodes for the other nodes in the cluster.

6, Java: Install Java on all the servers.

That’s it. You’re all set to start setting up the three node Elasticsearch cluster.

Steps to setup three node Elasticsearch cluster on CentOS

Step 1: Install Java

As I mentioned in prerequisites, Elasticsearch needs Java, so we need to install Java first. To install Java on CentOS, please execute the following command:

yum install java-1.8.0-openjdk 

Execute “java -version” and make sure that Java is installed correctly.

Step 2: Download the Elasticsearch RPM

curl -L -O https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-6.7.2.rpm 

You can download the latest version from the official Download Elasticsearch page. That page lists all the available packages: RPM, DEB and so on.

Step 3: Install using RPM

rpm -i elasticsearch-6.7.2.rpm 

Step 4: Start / Enable service

systemctl daemon-reload
systemctl enable elasticsearch.service
systemctl start elasticsearch.service

The installation part is done. Once you have installed Elasticsearch on all three servers, you can start editing the configuration to form a cluster from these three nodes.

The Elasticsearch configuration file is located here: /etc/elasticsearch/elasticsearch.yml

Before making changes in the configuration, make sure that ports 9200 and 9300 are open between the nodes in the cluster, and add firewall rules accordingly. Test with telnet or nc to confirm that connections between the nodes work.
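If your servers use firewalld (the default on CentOS 7, but an assumption about your setup), the rules could be generated with something like the sketch below. The function name is made up for illustration and the peer IPs are the example addresses used later in this article. The script only prints the commands so that you can review them before running anything.

```shell
# Example peer addresses from this article; replace with your own node IPs.
PEERS="10.22.28.112 10.22.28.113 10.22.28.114"

# Print (do not run) one firewalld rich rule per peer and port.
# 9200 is the HTTP API port, 9300 is the inter-node transport port.
print_es_firewall_rules() {
    for ip in $PEERS; do
        for port in 9200 9300; do
            printf '%s\n' "firewall-cmd --permanent --add-rich-rule='rule family=\"ipv4\" source address=\"$ip\" port port=\"$port\" protocol=\"tcp\" accept'"
        done
    done
    # Permanent rules only take effect after a reload.
    printf '%s\n' "firewall-cmd --reload"
}

print_es_firewall_rules
```

After reviewing the output, you can run the printed commands as root on each node and then repeat the telnet / nc test between the nodes.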

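Step 5: Set the JVM heap size

By default, Elasticsearch 6.x sets both the minimum and the maximum JVM heap size to 1 GB. If your servers have less or more memory than that suits, change these values in /etc/elasticsearch/jvm.options. For example, for a 2 GB heap:

-Xms2g
-Xmx2g

Choose the value based on the memory available on your servers, for example -Xms512m with -Xmx512m, or -Xms1g with -Xmx1g. Set -Xms and -Xmx to the same value, and keep the heap at or below half of the server's RAM so that the rest is left for the filesystem cache.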
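As a rough aid for picking the value, here is a small sketch that suggests a heap size from the total RAM. The function name suggest_heap_mb is hypothetical, and the 26 GB cap is a conservative assumption based on the usual advice to stay below the JVM's compressed object pointer threshold; treat the output as a starting point, not a tuned value.

```shell
# Suggest a heap size in MB: half of the RAM, capped at 26 GB (26624 MB).
# The cap is an assumption chosen to stay below the compressed-oops cutoff.
suggest_heap_mb() {
    half=$(( $1 / 2 ))
    if [ "$half" -gt 26624 ]; then
        half=26624
    fi
    echo "$half"
}

# On Linux, read the total RAM from /proc/meminfo (reported in kB)
# and print the two lines to put in /etc/elasticsearch/jvm.options.
if [ -r /proc/meminfo ]; then
    ram_mb=$(awk '/^MemTotal:/ {print int($2 / 1024)}' /proc/meminfo)
    heap_mb=$(suggest_heap_mb "$ram_mb")
    echo "-Xms${heap_mb}m"
    echo "-Xmx${heap_mb}m"
fi
```

For example, suggest_heap_mb 4096 gives 2048, which corresponds to the -Xms2g / -Xmx2g lines shown above.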

Step 6: Create a Data Directory for Elasticsearch (optional)

It’s better to attach a separate disk for Elasticsearch Data. If you have enough space on your primary disk, you can go ahead with that one. Just create a new directory and set relevant permissions to that directory.

mkdir /var/lib/elasticsearch/data
chown -R elasticsearch:elasticsearch /var/lib/elasticsearch/data
chmod -R 775 /var/lib/elasticsearch/data

Step 7: Set Data Directory

We have already created a directory for storing the Elasticsearch data. Now set it in the configuration file /etc/elasticsearch/elasticsearch.yml:

path.data: /var/lib/elasticsearch/data 

Step 8: Configure Elasticsearch cluster

As I mentioned, the cluster settings go in the configuration file /etc/elasticsearch/elasticsearch.yml. Make the following changes in that file on each node to set up the cluster.

8.1: Stop Elasticsearch, if it’s running.

systemctl stop elasticsearch.service 

8.2: On all nodes, set the cluster name:

cluster.name: es-crybit 

Open the configuration file on all three servers and set the same cluster name on each of them.

8.3: Set a unique node name on each node

node.name: es1 

8.4: Bind an IP for Elasticsearch

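By default, Elasticsearch binds only to the loopback address (localhost), so the other nodes cannot reach it. Set network.host to the node's own private IP, for example on the first node:

network.host: 10.22.28.112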

8.5: Set discovery by listing the IP addresses of all nodes (add it on all nodes)

discovery.zen.ping.unicast.hosts: ["10.22.28.112", "10.22.28.113", "10.22.28.114"] 

8.6: Specify the minimum number of master-eligible nodes needed to elect a master (add it on all nodes)

discovery.zen.minimum_master_nodes: 2 
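The value 2 is the quorum for three master-eligible nodes: half of the master-eligible nodes, rounded down, plus one. A tiny sketch of that arithmetic (min_master_nodes is just an illustrative function name) is handy if you later grow the cluster:

```shell
# Quorum of master-eligible nodes: floor(N / 2) + 1.
# Shell integer division already rounds down.
min_master_nodes() {
    echo $(( $1 / 2 + 1 ))
}

min_master_nodes 3   # three master-eligible nodes
min_master_nodes 5   # five master-eligible nodes
```

With three nodes this gives 2, so the cluster keeps working when one node is down, while a single isolated node cannot elect itself master on its own (which prevents split brain).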

8.7: Define Data & Master nodes

node.master: true
node.data: true

Add these based on your requirements. I set both to true on all nodes, so every node is master-eligible and also stores data.
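Putting Step 7 and Steps 8.2 to 8.7 together, the settings for one node could be rendered with a sketch like the one below. The function name render_es_config is hypothetical, and pairing the name es1 with 10.22.28.112 is my assumption for illustration; only the node name and the bind address differ between nodes.

```shell
# Print the cluster-related settings for one node.
# $1 = node name (must be unique per node), $2 = that node's private IP.
render_es_config() {
    node_name=$1
    node_ip=$2
    cat <<EOF
cluster.name: es-crybit
node.name: ${node_name}
path.data: /var/lib/elasticsearch/data
network.host: ${node_ip}
discovery.zen.ping.unicast.hosts: ["10.22.28.112", "10.22.28.113", "10.22.28.114"]
discovery.zen.minimum_master_nodes: 2
node.master: true
node.data: true
EOF
}

# Settings for the first node; repeat with the other names and IPs.
render_es_config es1 10.22.28.112
```

Compare the output with /etc/elasticsearch/elasticsearch.yml on each node before starting the service; the sketch prints to the terminal and does not modify any file.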

8.8: Start Elasticsearch

systemctl start elasticsearch.service 

That’s it, your cluster is ready. Now check the cluster health and make sure that the cluster is ready for production use.

Run the following curl call and make sure that the cluster status is green:

curl "http://10.22.28.112:9200/_cluster/health?pretty"
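If you want to script this check, for example right after restarting the nodes, a minimal sketch could look like this. The function names cluster_status and wait_for_green are made up for illustration, and the status is extracted with a simple sed pattern rather than a real JSON parser, which is an assumption that holds for the health response but not for arbitrary JSON.

```shell
# Read a _cluster/health JSON response on stdin and print the status value.
cluster_status() {
    sed -n 's/.*"status"[[:space:]]*:[[:space:]]*"\([a-z]*\)".*/\1/p' | head -n 1
}

# Poll the health endpoint until the status is green.
# $1 = base URL of any node, $2 = number of attempts (default 30).
wait_for_green() {
    url=$1
    tries=${2:-30}
    i=0
    while [ "$i" -lt "$tries" ]; do
        status=$(curl -s "$url/_cluster/health" | cluster_status)
        if [ "$status" = "green" ]; then
            echo "cluster is green"
            return 0
        fi
        i=$((i + 1))
        sleep 2
    done
    echo "cluster status: ${status:-unreachable}" >&2
    return 1
}

# Usage (not executed here): wait_for_green http://10.22.28.112:9200
```

The health API can also do the waiting on the server side with the wait_for_status and timeout parameters, for example "http://10.22.28.112:9200/_cluster/health?wait_for_status=green&timeout=50s". To confirm that all three nodes have joined, check that number_of_nodes in the response is 3.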

Yes, your cluster is ready to use now. I will create a separate article on basic commands (API calls) of Elasticsearch later.

An Ansible role for Elasticsearch is available, and it makes setting up and managing the cluster easy. I will write a blog on that soon.

Thanks! Please let me know if you have any questions.

Arunlal A

Senior System Developer at Zeta. Linux lover. Traveller. Let's connect! Whether you're a seasoned DevOps pro or just starting your journey, I'm always eager to engage with like-minded individuals. Follow my blog for regular updates, connect on social media, and let's embark on this DevOps adventure together! Happy coding and deploying!

4 thoughts on “Setup/configure a three node Elasticsearch cluster on CentOS 7”

  1. Just dropping a comment which might help folks trying out a multi-master setup: Make sure you have your /etc/hosts file in each of the ES nodes populated with entries of the entire cluster. You might end up with a “master not discovered exception” otherwise.

  2. For deciding the number of minimum master nodes config parameter, it is recommended to use this formula:
    (M / 2) + 1, rounded down to the nearest integer, where M is the number of nodes in the ES cluster.

    In our case, M=3 (3 nodes in our cluster)
    No. of min. masters = (3 / 2) + 1 = (1.5) + 1 = 2.5 => 2

  3. I didn’t get past step 2… I’m sure it’s easy enough to go and find the elasticsearch downloads as your URL for the curl is no longer valid, but trying to set up directories and changing ownership to a non-existent user just made me lose confidence for any of the content further along in the document… You really need somebody to check the accuracy and proof read your posts.

    1. Hey thanks for your comment.

      * URL for the curl is no longer valid

      It’s working. I added the link for repo. You can change that as you wish.

      * but trying to set up directories and changing ownership to a non-existent user just made me lose confidence for any of the content further along in the document

      That step wasn’t added first. I have updated the document with proper steps.
      Thanks!
