Welcome to our guide on how to install and set up GlusterFS on Ubuntu 18.04. So, what is GlusterFS? GlusterFS is an open-source, distributed and scalable network file system that clusters various disk storage resources into a single global namespace. GlusterFS is suitable for data-intensive tasks such as cloud storage and media streaming. Some of the common features of GlusterFS include:
- Can scale to several petabytes and handle thousands of clients.
- Provides high availability through data mirroring. It also supports a self-healing mechanism that restores data to the correct state following recovery with nearly no overhead.
- Provides a unified global namespace that clusters disk and memory resources into a single pool that ensures load-balanced I/O.
- Uses an elastic hash algorithm to locate data in the storage pool, hence linear performance scaling.
- Provides an elastic volume manager in which data is stored in logical volumes that are abstracted from the hardware and logically partitioned from each other. This ensures that storage can be added or removed while data continues to be online with no application interruption.
- Gluster is fully POSIX-compliant and does not require any unique APIs for data access.
- Supports industry-standard protocols like NFS, SMB, CIFS, HTTP and FTP.
Installing GlusterFS on Ubuntu 18.04
While setting up GlusterFS, there are different types of volume architectures that you may want to consider. These include:
- Distributed GlusterFS Volume
- Replicated GlusterFS Volume
- Distributed Replicated GlusterFS Volume
- Striped GlusterFS Volume
- Distributed Striped GlusterFS Volume
In this guide, we are going to learn how to set up a distributed GlusterFS volume. With the distributed volume, files are distributed across the bricks in the volume such that file A is stored on one of the bricks and file B on the other. The purpose of this architecture is to cheaply scale the volume size. However, it doesn’t provide redundancy, and the failure of a brick will lead to a complete loss of the data stored on that brick.
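To make the idea of hash-based placement concrete, here is a toy sketch. It is not GlusterFS's real elastic hash algorithm, which is more involved; it only illustrates that hashing a file name can pick a brick deterministically without any central lookup table. The function name pick_brick and the use of cksum are assumptions made for this illustration.

```shell
# Toy hash-based placement. NOT GlusterFS's actual algorithm.
# pick_brick NAME BRICK...: print the brick chosen for file NAME.
pick_brick() {
  name=$1; shift
  # cksum prints "<crc> <byte count>"; keep only the CRC.
  crc=$(printf '%s' "$name" | cksum | cut -d' ' -f1)
  # Drop (crc mod number-of-bricks) bricks from the front; the next one wins.
  shift $(( crc % $# ))
  printf '%s\n' "$1"
}

# The same file name always maps to the same brick.
for f in test-file test-file-two; do
  echo "$f -> $(pick_brick "$f" u18svrnode01:/gfsvolume/gv0 u18svrnode02:/gfsvolume/gv0)"
done
```

Because the result depends only on the file name and the brick list, any client can compute where a file lives on its own.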
Our environment consists of two storage nodes and a single client. Their details are shown below.
- Storage Node 1:
  - Hostname: gfsnode01.example.com
  - IP address: 192.168.43.30
  - Gluster Storage Disk: /dev/sdb1
  - Mount Point: /gfsvolume
  - OS: Ubuntu 18.04
- Storage Node 2:
  - Hostname: gfsnode02.example.com
  - IP address: 192.168.43.177
  - Gluster Storage Disk: /dev/sdb1
  - Mount Point: /gfsvolume
  - OS: Ubuntu 18.04
- GlusterFS Client:
  - Hostname: gfsclient.example.com
  - IP address: 192.168.43.197
  - OS: Ubuntu 18.04
Ensure that the hostnames are resolvable. If you don't have a DNS server, populate the hosts file on each server so that the three servers can reach each other by hostname.
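If you go the hosts-file route, a small helper along these lines keeps the entries from being duplicated when it is run more than once. This is a minimal sketch: add_host is a hypothetical helper name, and the IP addresses and hostnames are the ones from the environment list above, so substitute whichever hostnames you actually use.

```shell
# add_host FILE IP NAME: append "IP<TAB>NAME" to FILE unless NAME is already listed.
add_host() {
  file=$1 ip=$2 name=$3
  if ! grep -qwF -- "$name" "$file" 2>/dev/null; then
    printf '%s\t%s\n' "$ip" "$name" >> "$file"
  fi
}

# Example (run as root on each of the three machines):
#   add_host /etc/hosts 192.168.43.30  gfsnode01.example.com
#   add_host /etc/hosts 192.168.43.177 gfsnode02.example.com
#   add_host /etc/hosts 192.168.43.197 gfsclient.example.com
```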
Another thing to consider is time synchronization. Ensure that the time is synchronized, for example via NTP, across the three servers.
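One way to check synchronization is to look at what timedatectl reports. The helper below is a sketch that reads the output of timedatectl status on standard input; clock_synced is a hypothetical name, and the exact label of the line differs between systemd versions, so the pattern only looks for "synchronized: yes".

```shell
# clock_synced: read `timedatectl status` output on stdin;
# succeed if it contains a "... synchronized: yes" line.
clock_synced() {
  grep -Eqi 'synchronized:[[:space:]]*yes'
}

# Example (run on each server):
#   timedatectl status | clock_synced && echo "clock is synchronized"
```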
Update the package index and upgrade your system packages.
apt update && apt upgrade
Install GlusterFS Server on Ubuntu 18.04 Nodes
GlusterFS-3 is available in the default Ubuntu 18.04 repositories. Hence, to install the latest release, GlusterFS-5, you need to add the glusterfs-5 PPA repository.
sudo apt-get install software-properties-common
sudo add-apt-repository ppa:gluster/glusterfs-5
Once you add the PPA repository, update the package index.
apt update
Install the GlusterFS-5 server on both nodes:
apt install glusterfs-server
The GlusterFS server (glusterd) is set to run by default after installation. Enable it to run on system boot.
systemctl enable glusterd
Install GlusterFS Client on Ubuntu 18.04 Client
Add the PPA repository.
apt-get install software-properties-common
add-apt-repository ppa:gluster/glusterfs-5
Install the GlusterFS-5 client on the client machine:
apt install glusterfs-client
Set Up Distributed GlusterFS Volume on Ubuntu 18.04
Configure Firewall
If UFW is running, run the command below to allow the storage nodes to communicate with each other.
sudo ufw allow from <other-node-IP>
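Since the rule has to be repeated on each node with a different peer address, it can help to generate the commands first and review them before running anything. The sketch below only prints the rules; ufw_rules_for is a hypothetical helper, and including the client's address is an assumption on top of the text above (the client also has to reach the nodes in order to mount the volume).

```shell
# ufw_rules_for SELF_IP IP...: print (do not run) one "ufw allow from" rule
# for every listed address except the machine's own.
ufw_rules_for() {
  self=$1; shift
  for ip in "$@"; do
    [ "$ip" = "$self" ] || echo "ufw allow from $ip"
  done
}

# Rules to apply on Storage Node 1 (192.168.43.30):
ufw_rules_for 192.168.43.30 192.168.43.30 192.168.43.177 192.168.43.197
```

For Storage Node 1 this prints one rule for 192.168.43.177 and one for 192.168.43.197. Run the printed commands with sudo once they look right.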
Configure GlusterFS Trusted Pool
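To create a trusted storage pool between the nodes, run the probe from Storage Node 1 as shown below. The commands and output in the rest of this guide show the hostnames u18svrnode01 and u18svrnode02; substitute the hostnames of your own storage nodes.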
gluster peer probe u18svrnode02.example.com
peer probe: success.
To check the status of the trusted pool just created above, execute the command below:
gluster peer status
Number of Peers: 1
Hostname: u18svrnode02.example.com
Uuid: 5a2dd392-9e3b-4710-8803-e6055694a955
State: Peer in Cluster (Connected)
If you get State: Peer Rejected (Connected), see the resolution here.
To list the storage pools:
gluster pool list
UUID Hostname State
5a2dd392-9e3b-4710-8803-e6055694a955 u18svrnode02.example.com Connected
639199cd-575a-441b-996b-313c5ab703bd localhost Connected
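If you want to check the pool from a script rather than by eye, the State column can be tested directly. This is a minimal sketch that assumes the column layout shown above (a header row, then one row per peer ending in its state); all_peers_connected is a hypothetical helper name.

```shell
# all_peers_connected: read `gluster pool list` output on stdin;
# succeed only if every row after the header ends in "Connected".
all_peers_connected() {
  awk 'NR > 1 && NF > 0 && $NF != "Connected" { bad = 1 } END { exit bad + 0 }'
}

# Example (run on a storage node):
#   gluster pool list | all_peers_connected && echo "all peers connected"
```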
Create Distributed GlusterFS Volume
Create a brick directory for GlusterFS volumes on the GlusterFS storage device mount point on both storage nodes.
mkdir /gfsvolume/gv0
Next, create a distributed volume called distributed_vol that spans both nodes. Run the command from one node only.
gluster volume create distributed_vol transport tcp u18svrnode01:/gfsvolume/gv0 u18svrnode02:/gfsvolume/gv0
volume create: distributed_vol: success: please start the volume to access data
Start the created volume.
gluster volume start distributed_vol
volume start: distributed_vol: success
Show information about the created volume.
gluster volume info
Volume Name: distributed_vol
Type: Distribute
Volume ID: acc2cf8f-5177-4e2e-9772-dc0f1b791abe
Status: Started
Snapshot Count: 0
Number of Bricks: 2
Transport-type: tcp
Bricks:
Brick1: u18svrnode01:/gfsvolume/gv0
Brick2: u18svrnode02:/gfsvolume/gv0
Options Reconfigured:
transport.address-family: inet
nfs.disable: on
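The "Name: value" layout of this output is easy to pick apart in a script, for example to confirm that the volume is started before mounting it. The sketch below assumes that layout; volume_field is a hypothetical helper name.

```shell
# volume_field NAME: read `gluster volume info` output on stdin
# and print the value of the field called NAME.
volume_field() {
  awk -F': ' -v key="$1" '$1 == key { print $2 }'
}

# Example (run on a storage node):
#   gluster volume info distributed_vol | volume_field Status
```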
Mount the Volume on GlusterFS client
We are going to use the native GlusterFS client to mount the GlusterFS volume.
Create the mount point.
mkdir /mnt/gfsvol
Mount the distributed volume.
mount -t glusterfs u18svrnode01.example.com:/distributed_vol /mnt/gfsvol/
Run the df command to check the mounted filesystems.
df -hTP /mnt/gfsvol/
Filesystem Type Size Used Avail Use% Mounted on
u18svrnode01.example.com:/distributed_vol fuse.glusterfs 2.0G 88M 2.0G 5% /mnt/gfsvol
To automount the volume on system boot, add the line below to /etc/fstab.
u18svrnode01.example.com:/distributed_vol /mnt/gfsvol glusterfs defaults,_netdev 0 0
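When adding the entry from a script, it is worth guarding against appending it twice. The helper below is a sketch of that idea; ensure_line is a hypothetical name, and it only compares whole lines exactly.

```shell
# ensure_line FILE LINE: append LINE to FILE only if that exact line is missing.
ensure_line() {
  grep -qxF -- "$2" "$1" 2>/dev/null || printf '%s\n' "$2" >> "$1"
}

# Example (run as root on the client):
#   ensure_line /etc/fstab 'u18svrnode01.example.com:/distributed_vol /mnt/gfsvol glusterfs defaults,_netdev 0 0'
#   mount -a   # mounts everything listed in /etc/fstab, so mistakes show up now, not at boot
```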
To test the data distribution, create two test files on the client. Each file is stored on exactly one of the bricks; in the example below, the two files land on different bricks.
mkdir /mnt/gfsvol/Test-dir
touch /mnt/gfsvol/Test-dir/test-file
touch /mnt/gfsvol/Test-dir/test-file-two
If you check on node01,
ls /gfsvolume/gv0/Test-dir/
test-file-two
On node02,
ls /gfsvolume/gv0/Test-dir/
test-file
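Two files are a small sample, and nothing guarantees they land on different bricks. Creating a larger batch makes the split easier to see. The following is a sketch with hypothetical helper names (make_test_files, count_files); it assumes the mount point and brick paths used in this guide.

```shell
# make_test_files DIR COUNT: create empty files file-1 .. file-COUNT in DIR.
make_test_files() {
  mkdir -p "$1"
  i=1
  while [ "$i" -le "$2" ]; do
    : > "$1/file-$i"
    i=$(( i + 1 ))
  done
}

# count_files DIR: print the number of regular files directly inside DIR.
count_files() {
  find "$1" -maxdepth 1 -type f | wc -l
}

# On the client:
#   make_test_files /mnt/gfsvol/Test-dir 20
# On each storage node:
#   count_files /gfsvolume/gv0/Test-dir
```

The counts from the two nodes should add up to the number of files that Test-dir shows on the client, with each file present on only one node.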
In this guide, you have learnt how to set up a distributed GlusterFS volume and verified that it works. In our next tutorial, we will learn how to set up replicated GlusterFS volumes.