Welcome to our guide on how to install and set up GlusterFS on Ubuntu 22.04/Ubuntu 20.04. So, what is GlusterFS? GlusterFS is an open-source, distributed and scalable network file system that aggregates disk storage resources from multiple servers into a single global namespace. GlusterFS is suitable for data-intensive tasks such as cloud storage and media streaming.
Install and Setup GlusterFS on Ubuntu 22.04/Ubuntu 20.04
While setting up GlusterFS, there are different types of volume architectures that you may want to consider. These include:
- Distributed GlusterFS Volume
- Replicated GlusterFS Volume
- Distributed Replicated GlusterFS Volume
- Striped GlusterFS Volume
- Distributed Striped GlusterFS Volume
How to Setup Distributed GlusterFS Volume on Ubuntu 22.04/Ubuntu 20.04
In this guide, we are going to learn how to set up a distributed GlusterFS volume. With a distributed volume, files are spread across the bricks in the volume such that file A is stored on one brick and file B on another. The purpose of this architecture is to cheaply scale the volume size. However, it doesn’t provide redundancy, and the failure of a brick will lead to a complete loss of the data stored on that brick.
Our environment consists of two storage nodes and a single client. Their details are shown below.
Note, do not use the root partition for GlusterFS. Add another storage partition, preferably on a different drive.
- Storage Node 1:
  - Hostname: gfs01.kifarunix-demo.com
  - IP address: 192.168.57.6
  - Gluster Storage Disk: /dev/sdb1
  - Size: 4GB
  - Mount Point: /gfsvolume
  - OS: Ubuntu 22.04/Ubuntu 20.04
- Storage Node 2:
  - Hostname: gfs02.kifarunix-demo.com
  - IP address: 192.168.56.124
  - Gluster Storage Disk: /dev/sdb1
  - Size: 4GB
  - Mount Point: /gfsvolume
  - OS: Ubuntu 22.04/Ubuntu 20.04
- GlusterFS Client:
  - Hostname: gfsclient.kifarunix-demo.com
  - IP address: 192.168.43.197
  - OS: Ubuntu 22.04/Ubuntu 20.04
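- Ensure that the hostnames are resolvable. If you don't have a DNS server, populate the hosts file on each machine so that the two storage nodes and the client can reach each other by hostname. The volume and mount commands later in this guide use the short names gfs01 and gfs02, so make sure those resolve as well.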
- Another thing to consider is the NTP server. Ensure that the time is synchronized for the three servers.
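For the name-resolution requirement above, the hosts entries could look like the following sketch. The short aliases (gfs01, gfs02, gfsclient) are an assumption, added because later commands in this guide refer to the nodes as gfs01 and gfs02. The function name print_gluster_hosts is hypothetical, and nothing is written to /etc/hosts until you choose to append the output.

```shell
#!/bin/sh
# Sketch: print /etc/hosts entries for the three machines in this guide.
# The short aliases (gfs01, gfs02, gfsclient) are an assumption; adjust the
# addresses and names to match your own environment.
print_gluster_hosts() {
    cat <<'EOF'
192.168.57.6    gfs01.kifarunix-demo.com gfs01
192.168.56.124  gfs02.kifarunix-demo.com gfs02
192.168.43.197  gfsclient.kifarunix-demo.com gfsclient
EOF
}

# Review the entries first. To apply them, run as root on every machine:
#   print_gluster_hosts >> /etc/hosts
print_gluster_hosts
```

Printing the entries instead of editing the file directly keeps the step reviewable and makes it harder to append the same lines twice by accident.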
Update your system package index.
apt update
Install GlusterFS Server on Ubuntu 22.04/Ubuntu 20.04 Nodes
GlusterFS packages are available in the default Ubuntu 22.04/Ubuntu 20.04 repositories. Run the command below to install the GlusterFS server.
apt install glusterfs-server
Start and enable the GlusterFS server (glusterd) to run on system boot;
systemctl enable --now glusterd
Check the status;
systemctl status glusterd
● glusterd.service - GlusterFS, a clustered file-system server
Loaded: loaded (/lib/systemd/system/glusterd.service; enabled; vendor preset: enabled)
Active: active (running) since Thu 2022-06-02 17:59:13 UTC; 11s ago
Docs: man:glusterd(8)
Process: 3346 ExecStart=/usr/sbin/glusterd -p /var/run/glusterd.pid --log-level $LOG_LEVEL $GLUSTERD_OPTIONS (code=exited, status=0/SUCCESS)
Main PID: 3347 (glusterd)
Tasks: 9 (limit: 2241)
Memory: 7.1M
CPU: 1.394s
CGroup: /system.slice/glusterd.service
└─3347 /usr/sbin/glusterd -p /var/run/glusterd.pid --log-level INFO
Jun 02 17:59:11 gfs02.kifarunix-demo.com systemd[1]: Starting GlusterFS, a clustered file-system server...
Jun 02 17:59:13 gfs02.kifarunix-demo.com systemd[1]: Started GlusterFS, a clustered file-system server.
Setup Distributed GlusterFS Volume on Ubuntu 22.04/Ubuntu 20.04
If a firewall is running, run the command below on each storage node to allow the Gluster storage nodes to communicate with each other via the Gluster daemon service port, 24007/TCP.
ufw allow from <other-node-IP> to any port 24007 proto tcp comment "GlusterFS Management"
If you are using iptables;
iptables -A INPUT -s <other-node-IP> -p tcp --dport 24007 -j ACCEPT -m comment --comment "GlusterFS Management"
cp /etc/iptables/rules.v4{,.old}
iptables-save > /etc/iptables/rules.v4
Also allow GlusterFS clients to connect to GlusterFS daemon;
ufw allow from <Client-IP> to any port 24007 proto tcp comment "GlusterFS Client Access"
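When several peers or clients need access, the same rule has to be repeated for each address. The sketch below only prints the ufw commands so they can be reviewed before being applied. The function name mgmt_rules is hypothetical, and the addresses in the example call are the ones from the environment table in this guide.

```shell
#!/bin/sh
# Sketch: print one ufw rule per peer or client address for the Gluster
# management port (24007/tcp). Nothing is changed on the system; the rules
# are only printed. mgmt_rules is a hypothetical helper name.
mgmt_rules() {
    for ip in "$@"; do
        printf 'ufw allow from %s to any port 24007 proto tcp comment "GlusterFS Management"\n' "$ip"
    done
}

# Example for node 01: allow node 02 and the client from this guide.
mgmt_rules 192.168.56.124 192.168.43.197

# To apply the rules after review, run as root:
#   mgmt_rules 192.168.56.124 192.168.43.197 | sh
```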
- Configure GlusterFS Trusted Pool
To create a trusted storage pool between the GlusterFS nodes, run the probe from GlusterFS Node01 as shown below;
gluster peer probe gfs02.kifarunix-demo.com
Sample output;
peer probe: success.
To check the status of the trusted pool just created above, execute the command below;
gluster peer status
Number of Peers: 1
Hostname: gfs02.kifarunix-demo.com
Uuid: b81803a8-893a-499e-9a87-6bac00a62822
State: Accepted peer request (Connected)
If you get State: Peer Rejected (Connected), see the resolution here.
Output from the second node;
Number of Peers: 1
Hostname: gfs01.kifarunix-demo.com
Uuid: 26fe538a-91c2-42a1-b34a-67c2c94c7492
State: Peer in Cluster (Connected)
To list the storage pools;
gluster pool list
UUID Hostname State
b81803a8-893a-499e-9a87-6bac00a62822 gfs02.kifarunix-demo.com Connected
26fe538a-91c2-42a1-b34a-67c2c94c7492 localhost Connected
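Before creating a volume, it can help to confirm that every peer has fully joined the pool. The sketch below counts the peers whose state is "Peer in Cluster (Connected)" in the text of gluster peer status. It reads from stdin, so it can be tried on saved output; the function name peers_in_cluster is hypothetical.

```shell
#!/bin/sh
# Sketch: count peers that have fully joined the trusted pool, based on the
# text printed by `gluster peer status` (read from stdin).
# peers_in_cluster is a hypothetical helper name.
peers_in_cluster() {
    grep -c '^State: Peer in Cluster (Connected)'
}

# On a storage node you would run (as root):
#   gluster peer status | peers_in_cluster
# Here the function is fed the sample output from the second node:
peers_in_cluster <<'EOF'
Number of Peers: 1
Hostname: gfs01.kifarunix-demo.com
Uuid: 26fe538a-91c2-42a1-b34a-67c2c94c7492
State: Peer in Cluster (Connected)
EOF
```

With the two-node pool in this guide, each node should report one peer in this state.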
- Create Distributed GlusterFS Volume
Create a brick directory for GlusterFS volumes on the GlusterFS storage device mount point on both storage nodes.
A Brick is the basic unit of storage in GlusterFS, represented by an export directory on a server in the trusted storage pool. A brick is expressed by combining a server with an export directory in the following format: `SERVER:EXPORT` For example: `myhostname:/exports/myexportdir/`.
Thus, on BOTH Nodes, create brick directory.
mkdir /gfsvolume/gv0
Please note that our GlusterFS disk, /dev/sdb1 is mounted on /gfsvolume directory.
df -hT -P /gfsvolume
Sample output;
Filesystem Type Size Used Avail Use% Mounted on
/dev/sdb1 ext4 3.9G 24K 3.7G 1% /gfsvolume
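Since the brick must not live on the root partition, a quick check before creating the volume can catch a missing mount. The sketch below compares the device numbers of two paths using GNU stat, which is assumed to be available (it is on Ubuntu). The function name same_filesystem is hypothetical.

```shell
#!/bin/sh
# Sketch: succeed when two paths live on the same filesystem, by comparing
# the device numbers reported by GNU stat (%d = device number).
# same_filesystem is a hypothetical helper name.
same_filesystem() {
    [ "$(stat -c %d "$1")" = "$(stat -c %d "$2")" ]
}

# Hypothetical usage on a storage node, before creating the volume:
#   if same_filesystem / /gfsvolume; then
#       echo "/gfsvolume is on the root filesystem; mount the Gluster disk first" >&2
#   fi
```

If /dev/sdb1 is not mounted, /gfsvolume is just a directory on the root filesystem and the check reports it.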
Next, create a distributed volume called distributed_vol across the nodes. The name can be anything!
Run this command on either of the nodes, once.
gluster volume create distributed_vol transport tcp gfs01:/gfsvolume/gv0 gfs02:/gfsvolume/gv0
Sample output;
volume create: distributed_vol: success: please start the volume to access data
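For a distributed volume, the create command is simply the list of SERVER:EXPORT bricks, one per node. As a sketch of how that command is assembled when more nodes are added, the helper below builds it from a volume name, a brick path and a list of node names. It only prints the command; the function name volume_create_cmd is hypothetical.

```shell
#!/bin/sh
# Sketch: build the `gluster volume create` command for a distributed volume
# from a volume name, a brick path and a list of node names.
# The command is printed, not executed.
# volume_create_cmd is a hypothetical helper name.
volume_create_cmd() {
    vol=$1
    brick=$2
    shift 2
    cmd="gluster volume create $vol transport tcp"
    for node in "$@"; do
        cmd="$cmd $node:$brick"   # one SERVER:EXPORT brick per node
    done
    printf '%s\n' "$cmd"
}

# Reproduces the command used in this guide:
volume_create_cmd distributed_vol /gfsvolume/gv0 gfs01 gfs02
```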
- Start the created volume.
You can now start the created volume.
gluster volume start distributed_vol
Sample output;
volume start: distributed_vol: success
- Show information about the created volume.
You can show information about the created volume using the command below;
gluster volume info
Volume Name: distributed_vol
Type: Distribute
Volume ID: 98519652-97a2-4fb8-bd1a-9b6a83d8936e
Status: Started
Snapshot Count: 0
Number of Bricks: 2
Transport-type: tcp
Bricks:
Brick1: gfs01:/gfsvolume/gv0
Brick2: gfs02:/gfsvolume/gv0
Options Reconfigured:
transport.address-family: inet
storage.fips-mode-rchecksum: on
nfs.disable: on
- Get GlusterFS volume status;
gluster volume status
Status of volume: distributed_vol
Gluster process TCP Port RDMA Port Online Pid
------------------------------------------------------------------------------
Brick gfs01:/gfsvolume/gv0 49152 0 Y 6110
Brick gfs02:/gfsvolume/gv0 60116 0 Y 4501
Task Status of Volume distributed_vol
------------------------------------------------------------------------------
There are no active volume tasks
- Open GlusterFS Volumes Ports on Firewall
In order for the clients to connect to the volumes created, you need to open each node's brick port on the firewall. The ports are listed in the TCP Port column of the gluster volume status output above.
Also ensure that the nodes can talk to each other on these ports.
For example, on GlusterFS node 01, open port 49152 to allow clients to mount the volume.
ufw allow from <Client-IP-or-Network> to any port 49152 proto tcp comment "GlusterFS Client Access"
ufw allow from <Node02 IP> to any port 49152 proto tcp comment "GlusterFS Node02"
On Node 02;
ufw allow from <Client-IP-or-Network> to any port 60116 proto tcp comment "GlusterFS Client Access"
ufw allow from <Node01 IP> to any port 60116 proto tcp comment "GlusterFS Node01"
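The brick ports differ per node, as the status output above shows, so it is worth reading them from gluster volume status instead of copying them by hand. The sketch below extracts the host and TCP port of each brick from that output. It assumes each brick is printed on a single line, as in the sample above; the function name brick_ports is hypothetical.

```shell
#!/bin/sh
# Sketch: read `gluster volume status` output on stdin and print
# "<host> <tcp-port>" for each brick line. Assumes every brick is printed
# on one line. brick_ports is a hypothetical helper name.
brick_ports() {
    awk '$1 == "Brick" && $2 ~ /:/ {
        split($2, parts, ":")   # parts[1] = host, parts[2] = brick path
        print parts[1], $3      # $3 is the TCP Port column
    }'
}

# On a storage node you would run (as root):
#   gluster volume status distributed_vol | brick_ports
# Here the function is fed the sample output from this guide:
brick_ports <<'EOF'
Status of volume: distributed_vol
Gluster process TCP Port RDMA Port Online Pid
Brick gfs01:/gfsvolume/gv0 49152 0 Y 6110
Brick gfs02:/gfsvolume/gv0 60116 0 Y 4501
EOF
```

Each printed pair tells you which port to open on which node with the ufw rules shown above.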
Install GlusterFS Client on Ubuntu 22.04/Ubuntu 20.04 Client
On the client, run the commands below to install the GlusterFS client;
apt update
apt install glusterfs-client
- Mount the GlusterFS Volume on GlusterFS client
We are going to use the native GlusterFS client to mount the GlusterFS volume.
Create the mount point
mkdir /mnt/gfsvol
Mount the distributed volume. If using domain names, ensure they are resolvable.
mount -t glusterfs gfs01:/distributed_vol /mnt/gfsvol/
Run the df command to check the mounted filesystems.
df -hTP /mnt/gfsvol/
Filesystem Type Size Used Avail Use% Mounted on
gfs01:/distributed_vol fuse.glusterfs 7.8G 97M 7.3G 2% /mnt/gfsvol
From other clients, you can mount the volume via the other node;
mount -t glusterfs gfs02:/distributed_vol /mnt/gfsvol/
To auto-mount the volume on system boot, add the line below to /etc/fstab.
gfs01:/distributed_vol /mnt/gfsvol glusterfs defaults,_netdev 0 0
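The fstab entry above names a single server, so the mount at boot depends on gfs01 being reachable at that moment. One possible variation, shown here as a sketch and not as part of the original setup, adds the backup-volfile-servers mount option so that the client can fetch the volume configuration from gfs02 if gfs01 does not answer. The function name fstab_line is hypothetical, and whether this helps in your environment is something to test.

```shell
#!/bin/sh
# Sketch: print an /etc/fstab line for a Gluster volume with a fallback
# volfile server. Using backup-volfile-servers here is an assumption about
# what suits your setup, not part of the original guide.
# fstab_line is a hypothetical helper name.
fstab_line() {
    primary=$1
    backup=$2
    volume=$3
    mountpoint=$4
    printf '%s:/%s %s glusterfs defaults,_netdev,backup-volfile-servers=%s 0 0\n' \
        "$primary" "$volume" "$mountpoint" "$backup"
}

# Prints the line only; add it to /etc/fstab by hand after review.
fstab_line gfs01 gfs02 distributed_vol /mnt/gfsvol
```

The server named in the mount source is only used to fetch the volume layout. Once mounted, the client talks to the bricks on both nodes directly.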
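To test the data distribution, create two test files on the client. Each file is stored whole on exactly one brick, which Gluster picks based on the file name, so with only two files it is possible for both to end up on the same brick. In our case, they landed on different bricks. See the example below;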
mkdir /mnt/gfsvol/Test-dir
touch /mnt/gfsvol/Test-dir/{test-file,test-file-two}
Checking on node01,
ls /gfsvolume/gv0/Test-dir/
test-file-two
On node02,
ls /gfsvolume/gv0/Test-dir/
test-file
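With only two files the split is easy to see by eye. With more test files, counting the files on each brick gives a clearer picture. The sketch below counts regular files under a brick directory while skipping the internal .glusterfs directory that Gluster keeps at the brick root. The function name count_brick_files is hypothetical.

```shell
#!/bin/sh
# Sketch: count regular files under a brick directory, skipping Gluster's
# internal .glusterfs metadata directory at the brick root.
# count_brick_files is a hypothetical helper name.
count_brick_files() {
    find "$1" -name .glusterfs -prune -o -type f -print | grep -c .
}

# Hypothetical usage, on each storage node:
#   count_brick_files /gfsvolume/gv0
```

The counts from the two nodes should add up to the number of files visible under /mnt/gfsvol on the client.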
That concludes our guide on setting up GlusterFS on Ubuntu 22.04/Ubuntu 20.04, specifically how to set up distributed GlusterFS volumes. In our next tutorial, we will learn how to set up replicated GlusterFS volumes.
This is fine and all, but mounting the gluster volume flat out does not work all the time. There is a serious bug/issue preventing the gluster volume from mounting reliably after reboots.
Getting gluster-server to work is fine. Getting gluster-client to work is not possible.
I have been pounding my head into my desk for the last two weeks trying to solve it.