This tutorial provides a step-by-step guide on how to install and set up a Kubernetes cluster on Ubuntu 24.04. Kubernetes, according to kubernetes.io, is an open-source, production-grade container orchestration platform. It facilitates automated deployment, scaling, and management of containerized applications.
Install and Setup Kubernetes Cluster on Ubuntu 24.04
Kubernetes Cluster Architecture
In this tutorial, we are going to install and set up a four-node (one control plane and three worker nodes) Kubernetes cluster.
A Kubernetes cluster is composed of a Master node, which hosts the control plane, and Worker nodes, which host Pods.
For a high-level overview of a Kubernetes cluster, check our guide:
Kubernetes Architecture: A High-level Overview of Kubernetes Cluster Components
Below are our node details.

Node      Hostname              IP Address      vCPUs  RAM (GB)  OS
Master    master.kifarunix.com  192.168.122.60  2      2         Ubuntu 24.04
Worker 1  wk01.kifarunix.com    192.168.122.61  2      2         Ubuntu 24.04
Worker 2  wk02.kifarunix.com    192.168.122.62  2      2         Ubuntu 24.04
Worker 3  wk03.kifarunix.com    192.168.122.63  2      2         Ubuntu 24.04
Run System Update on Cluster Nodes
To begin with, update system package cache on all the nodes;
sudo apt update
Disable Swap on Cluster Nodes
Running Kubernetes requires that you disable swap.
Check if swap is enabled.
swapon --show
NAME TYPE SIZE USED PRIO
/swap.img file 2G 0B -2
If there is no output, then swap is not enabled. If it is enabled as shown in the output above, run the command below to disable it.
sudo swapoff -v /swap.img
Or simply
sudo swapoff -a
To permanently disable swap, comment out or remove the swap line in the /etc/fstab file.
sudo sed -i '/swap/s/^/#/' /etc/fstab
Or simply remove it (a backup is saved as /etc/fstab.bak);
sudo sed -i.bak '/swap/d' /etc/fstab
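The effect of either sed variant can be previewed safely on a scratch copy before touching the real /etc/fstab; a minimal sketch:

```shell
# Preview the sed edits on a scratch file instead of the real /etc/fstab
tmp=$(mktemp)
printf '%s\n' 'UUID=abcd / ext4 defaults 0 1' '/swap.img none swap sw 0 0' > "$tmp"

# Variant 1: comment out the swap line (prints the result, file untouched)
sed '/swap/s/^/#/' "$tmp"

# Variant 2: delete the swap line in place; -i.bak keeps a backup copy
sed -i.bak '/swap/d' "$tmp"
grep -c swap "$tmp" || true   # 0 matching lines remain
rm -f "$tmp" "$tmp.bak"
```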
Enable Kernel IP forwarding on Cluster Nodes
In order to permit communication between Pods across different networks, the system should be able to route traffic between them. This can be achieved by enabling IP forwarding. Without IP forwarding, containers won't be able to communicate with resources outside their network namespace, which would limit their functionality and utility.
To enable IP forwarding, set the value of net.ipv4.ip_forward to 1.
echo "net.ipv4.ip_forward=1" | sudo tee -a /etc/sysctl.conf
Apply the changes;
sudo sysctl -p
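You can verify that the change took effect; the kernel exposes the live value under /proc, where 1 means forwarding is enabled:

```shell
# Read the live IP forwarding setting; 1 means forwarding is enabled
cat /proc/sys/net/ipv4/ip_forward
```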
Load Some Required Kernel Modules on Cluster Nodes
The overlay module provides support for the overlay filesystem. OverlayFS is a type of union filesystem used by container runtimes to layer the container's root filesystem over the host filesystem.
The br_netfilter module provides support for packet filtering in Linux bridge networks based on various criteria, such as source and destination IP address, port numbers, and protocol type.
Check if these modules are enabled/loaded;
sudo lsmod | grep -E "overlay|br_netfilter"
br_netfilter 32768 0
bridge 307200 1 br_netfilter
overlay 151552 9
If not loaded, just load them as follows;
echo 'overlay
br_netfilter' | sudo tee /etc/modules-load.d/kubernetes.conf
sudo modprobe overlay
sudo modprobe br_netfilter
Similarly, enable the Linux kernel's bridge netfilter so that bridged traffic is passed to iptables for filtering. This means that packets bridged between network interfaces can be filtered using iptables/ip6tables, just as if they were routed packets.
sudo tee -a /etc/sysctl.conf << 'EOL'
net.bridge.bridge-nf-call-iptables = 1
net.bridge.bridge-nf-call-ip6tables = 1
EOL
Apply the changes;
sudo sysctl -p
Install Container Runtime on Ubuntu 24.04
Kubernetes uses a container runtime to run containers in Pods. It supports multiple container runtimes, including Docker Engine, containerd, CRI-O, and Mirantis Container Runtime.
Install Containerd Runtime on all Cluster Nodes
In this demo, we will use containerd runtime. Therefore, on all nodes, master and workers, you need to install containerd runtime.
You can install containerd using the official binaries or from the Docker Engine APT repos. We will use the latter in this guide, thus;
sudo apt install apt-transport-https \
ca-certificates curl \
gnupg-agent \
software-properties-common
curl -fsSL https://download.docker.com/linux/ubuntu/gpg | \
sudo gpg --dearmor -o /etc/apt/trusted.gpg.d/docker.gpg
echo "deb [arch=amd64] https://download.docker.com/linux/ubuntu $(lsb_release -sc) stable" | sudo tee /etc/apt/sources.list.d/docker-ce.list
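If lsb_release is not available on a minimal install, the release codename (noble on Ubuntu 24.04) can also be read from /etc/os-release; a sketch:

```shell
# Fallback when lsb_release is not installed: read the codename from /etc/os-release
. /etc/os-release
echo "deb [arch=amd64] https://download.docker.com/linux/ubuntu ${VERSION_CODENAME} stable"
```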
sudo apt update
Install containerd;
sudo apt install -y containerd.io
The kubelet automatically detects the container runtime present on the node and uses it to run the containers.
Configure Cgroup Driver for ContainerD
Cgroup (control groups) is a Linux kernel feature that allows for the isolation, prioritization, and monitoring of system resources like CPU, memory, and disk I/O for a group of processes. Kubernetes (the kubelet and a container runtime such as containerd) uses a cgroup driver to interface with control groups in order to manage and set limits for the resources allocated to containers.
Kubernetes supports two types of cgroup drivers;
- cgroupfs (control groups filesystem): This is the default cgroup driver used by the Kubernetes kubelet to manage resources for containers.
- systemd: This is the default initialization system and service manager on most modern Linux systems. It offers functions such as starting daemons and keeping track of processes using Linux cgroups.
For systems that use systemd as their default init system, it is recommended to use the systemd cgroup driver for Kubernetes instead of cgroupfs.
The default configuration file for containerd is /etc/containerd/config.toml. When containerd is installed from the Docker APT repos, this file is created with minimal configuration. If installed from the official binaries, the containerd configuration file is not created at all.
Either way, update the containerd configuration file by executing the command below;
[ -d /etc/containerd ] || sudo mkdir /etc/containerd
containerd config default | sudo tee /etc/containerd/config.toml
Sample configuration.
disabled_plugins = []
imports = []
oom_score = 0
plugin_dir = ""
required_plugins = []
root = "/var/lib/containerd"
state = "/run/containerd"
temp = ""
version = 2
[cgroup]
path = ""
[debug]
address = ""
format = ""
gid = 0
level = ""
uid = 0
[grpc]
address = "/run/containerd/containerd.sock"
gid = 0
max_recv_message_size = 16777216
max_send_message_size = 16777216
tcp_address = ""
tcp_tls_ca = ""
tcp_tls_cert = ""
tcp_tls_key = ""
uid = 0
[metrics]
address = ""
grpc_histogram = false
[plugins]
[plugins."io.containerd.gc.v1.scheduler"]
deletion_threshold = 0
mutation_threshold = 100
pause_threshold = 0.02
schedule_delay = "0s"
startup_delay = "100ms"
[plugins."io.containerd.grpc.v1.cri"]
device_ownership_from_security_context = false
disable_apparmor = false
disable_cgroup = false
disable_hugetlb_controller = true
disable_proc_mount = false
disable_tcp_service = true
enable_selinux = false
enable_tls_streaming = false
enable_unprivileged_icmp = false
enable_unprivileged_ports = false
ignore_image_defined_volumes = false
max_concurrent_downloads = 3
max_container_log_line_size = 16384
netns_mounts_under_state_dir = false
restrict_oom_score_adj = false
sandbox_image = "registry.k8s.io/pause:3.6"
selinux_category_range = 1024
stats_collect_period = 10
stream_idle_timeout = "4h0m0s"
stream_server_address = "127.0.0.1"
stream_server_port = "0"
systemd_cgroup = false
tolerate_missing_hugetlb_controller = true
unset_seccomp_profile = ""
[plugins."io.containerd.grpc.v1.cri".cni]
bin_dir = "/opt/cni/bin"
conf_dir = "/etc/cni/net.d"
conf_template = ""
ip_pref = ""
max_conf_num = 1
[plugins."io.containerd.grpc.v1.cri".containerd]
default_runtime_name = "runc"
disable_snapshot_annotations = true
discard_unpacked_layers = false
ignore_rdt_not_enabled_errors = false
no_pivot = false
snapshotter = "overlayfs"
[plugins."io.containerd.grpc.v1.cri".containerd.default_runtime]
base_runtime_spec = ""
cni_conf_dir = ""
cni_max_conf_num = 0
container_annotations = []
pod_annotations = []
privileged_without_host_devices = false
runtime_engine = ""
runtime_path = ""
runtime_root = ""
runtime_type = ""
[plugins."io.containerd.grpc.v1.cri".containerd.default_runtime.options]
[plugins."io.containerd.grpc.v1.cri".containerd.runtimes]
[plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc]
base_runtime_spec = ""
cni_conf_dir = ""
cni_max_conf_num = 0
container_annotations = []
pod_annotations = []
privileged_without_host_devices = false
runtime_engine = ""
runtime_path = ""
runtime_root = ""
runtime_type = "io.containerd.runc.v2"
[plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc.options]
BinaryName = ""
CriuImagePath = ""
CriuPath = ""
CriuWorkPath = ""
IoGid = 0
IoUid = 0
NoNewKeyring = false
NoPivotRoot = false
Root = ""
ShimCgroup = ""
SystemdCgroup = false
[plugins."io.containerd.grpc.v1.cri".containerd.untrusted_workload_runtime]
base_runtime_spec = ""
cni_conf_dir = ""
cni_max_conf_num = 0
container_annotations = []
pod_annotations = []
privileged_without_host_devices = false
runtime_engine = ""
runtime_path = ""
runtime_root = ""
runtime_type = ""
[plugins."io.containerd.grpc.v1.cri".containerd.untrusted_workload_runtime.options]
[plugins."io.containerd.grpc.v1.cri".image_decryption]
key_model = "node"
[plugins."io.containerd.grpc.v1.cri".registry]
config_path = ""
[plugins."io.containerd.grpc.v1.cri".registry.auths]
[plugins."io.containerd.grpc.v1.cri".registry.configs]
[plugins."io.containerd.grpc.v1.cri".registry.headers]
[plugins."io.containerd.grpc.v1.cri".registry.mirrors]
[plugins."io.containerd.grpc.v1.cri".x509_key_pair_streaming]
tls_cert_file = ""
tls_key_file = ""
[plugins."io.containerd.internal.v1.opt"]
path = "/opt/containerd"
[plugins."io.containerd.internal.v1.restart"]
interval = "10s"
[plugins."io.containerd.internal.v1.tracing"]
sampling_ratio = 1.0
service_name = "containerd"
[plugins."io.containerd.metadata.v1.bolt"]
content_sharing_policy = "shared"
[plugins."io.containerd.monitor.v1.cgroups"]
no_prometheus = false
[plugins."io.containerd.runtime.v1.linux"]
no_shim = false
runtime = "runc"
runtime_root = ""
shim = "containerd-shim"
shim_debug = false
[plugins."io.containerd.runtime.v2.task"]
platforms = ["linux/amd64"]
sched_core = false
[plugins."io.containerd.service.v1.diff-service"]
default = ["walking"]
[plugins."io.containerd.service.v1.tasks-service"]
rdt_config_file = ""
[plugins."io.containerd.snapshotter.v1.aufs"]
root_path = ""
[plugins."io.containerd.snapshotter.v1.btrfs"]
root_path = ""
[plugins."io.containerd.snapshotter.v1.devmapper"]
async_remove = false
base_image_size = ""
discard_blocks = false
fs_options = ""
fs_type = ""
pool_name = ""
root_path = ""
[plugins."io.containerd.snapshotter.v1.native"]
root_path = ""
[plugins."io.containerd.snapshotter.v1.overlayfs"]
root_path = ""
upperdir_label = false
[plugins."io.containerd.snapshotter.v1.zfs"]
root_path = ""
[plugins."io.containerd.tracing.processor.v1.otlp"]
endpoint = ""
insecure = false
protocol = ""
[proxy_plugins]
[stream_processors]
[stream_processors."io.containerd.ocicrypt.decoder.v1.tar"]
accepts = ["application/vnd.oci.image.layer.v1.tar+encrypted"]
args = ["--decryption-keys-path", "/etc/containerd/ocicrypt/keys"]
env = ["OCICRYPT_KEYPROVIDER_CONFIG=/etc/containerd/ocicrypt/ocicrypt_keyprovider.conf"]
path = "ctd-decoder"
returns = "application/vnd.oci.image.layer.v1.tar"
[stream_processors."io.containerd.ocicrypt.decoder.v1.tar.gzip"]
accepts = ["application/vnd.oci.image.layer.v1.tar+gzip+encrypted"]
args = ["--decryption-keys-path", "/etc/containerd/ocicrypt/keys"]
env = ["OCICRYPT_KEYPROVIDER_CONFIG=/etc/containerd/ocicrypt/ocicrypt_keyprovider.conf"]
path = "ctd-decoder"
returns = "application/vnd.oci.image.layer.v1.tar+gzip"
[timeouts]
"io.containerd.timeout.bolt.open" = "0s"
"io.containerd.timeout.shim.cleanup" = "5s"
"io.containerd.timeout.shim.load" = "5s"
"io.containerd.timeout.shim.shutdown" = "3s"
"io.containerd.timeout.task.state" = "2s"
[ttrpc]
address = ""
gid = 0
uid = 0
Once you generate the default config, you need to enable the systemd cgroup driver for runc, containerd's low-level container runtime, by changing the value of SystemdCgroup from false to true.
sudo sed -i '/SystemdCgroup/s/false/true/' /etc/containerd/config.toml
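The one-liner above flips that single key; its effect can be sketched on a scratch copy of the relevant line:

```shell
# Demonstrate the SystemdCgroup flip on a scratch copy of the config line
tmp=$(mktemp)
printf '%s\n' '  SystemdCgroup = false' > "$tmp"
sed -i '/SystemdCgroup/s/false/true/' "$tmp"
grep SystemdCgroup "$tmp"   # now reads: SystemdCgroup = true
rm -f "$tmp"
```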
Also, as of this writing, it is recommended to use "registry.k8s.io/pause:3.9" as the CRI sandbox image. The pause container image is a minimal image that enables containerd to provide network isolation for Pods in Kubernetes. Containerd's default configuration uses pause:3.8.
grep sandbox_image /etc/containerd/config.toml
sandbox_image = "registry.k8s.io/pause:3.8"
To change this to pause:3.9;
sudo sed -i '/pause:3.8/s/3.8/3.9/' /etc/containerd/config.toml
If the default version is other than 3.8, then adjust the number accordingly.
Verify the changes again;
grep sandbox_image /etc/containerd/config.toml
sandbox_image = "registry.k8s.io/pause:3.9"
Start and enable containerd to run on system boot;
sudo systemctl enable --now containerd
Confirm the status;
systemctl status containerd
● containerd.service - containerd container runtime
Loaded: loaded (/usr/lib/systemd/system/containerd.service; enabled; preset: enabled)
Active: active (running) since Mon 2024-05-13 17:50:46 UTC; 3min 45s ago
Docs: https://containerd.io
Main PID: 2826 (containerd)
Tasks: 8
Memory: 13.5M (peak: 14.0M)
CPU: 1.303s
CGroup: /system.slice/containerd.service
└─2826 /usr/bin/containerd
May 13 17:50:46 master.kifarunix.com containerd[2826]: time="2024-05-13T17:50:46.057491119Z" level=info msg="Start subscribing containerd event"
May 13 17:50:46 master.kifarunix.com containerd[2826]: time="2024-05-13T17:50:46.057526439Z" level=info msg="Start recovering state"
May 13 17:50:46 master.kifarunix.com containerd[2826]: time="2024-05-13T17:50:46.057810777Z" level=info msg="Start event monitor"
May 13 17:50:46 master.kifarunix.com containerd[2826]: time="2024-05-13T17:50:46.057825824Z" level=info msg="Start snapshots syncer"
May 13 17:50:46 master.kifarunix.com containerd[2826]: time="2024-05-13T17:50:46.057831474Z" level=info msg="Start cni network conf syncer for default"
May 13 17:50:46 master.kifarunix.com containerd[2826]: time="2024-05-13T17:50:46.057833164Z" level=info msg=serving... address=/run/containerd/containerd.sock.ttrpc
May 13 17:50:46 master.kifarunix.com containerd[2826]: time="2024-05-13T17:50:46.057840998Z" level=info msg="Start streaming server"
May 13 17:50:46 master.kifarunix.com containerd[2826]: time="2024-05-13T17:50:46.057873084Z" level=info msg=serving... address=/run/containerd/containerd.sock
May 13 17:50:46 master.kifarunix.com systemd[1]: Started containerd.service - containerd container runtime.
May 13 17:50:46 master.kifarunix.com containerd[2826]: time="2024-05-13T17:50:46.059380011Z" level=info msg="containerd successfully booted in 0.019204s"
Install Kubernetes on Ubuntu 24.04
There are a number of node components required to provide the Kubernetes runtime environment that need to be installed on each node. These include:
- kubelet: runs as an agent on each node and ensures that containers are running in a Pod.
- kubeadm: bootstraps the Kubernetes cluster.
- kubectl: used to run commands against Kubernetes clusters.
These components are not available in the default Ubuntu repos. Thus, you need to add the Kubernetes APT repository in order to install them.
Install Kubernetes Repository GPG Signing Key
Run the command below to install Kubernetes repo GPG key.
sudo apt install gnupg2 -y
Replace the value of the VER variable below with the release number of Kubernetes you need to run! In this guide, I will be using the current latest minor release version, v1.30.
VER=1.30
curl -fsSL https://pkgs.k8s.io/core:/stable:/v${VER}/deb/Release.key | \
sudo gpg --dearmor -o /etc/apt/trusted.gpg.d/k8s.gpg
Install Kubernetes Repository on Ubuntu 24.04
Next install the Kubernetes repository of the version matching the GPG key installed above;
echo "deb https://pkgs.k8s.io/core:/stable:/v${VER}/deb/ /" | sudo tee /etc/apt/sources.list.d/kubernetes.list
Install Kubernetes components on all the nodes
sudo apt update
sudo apt install kubelet kubeadm kubectl -y
Mark Hold Kubernetes Packages
In order to maintain the stability of the cluster, it is important to pin specific versions of critical packages like kubeadm, kubelet, and kubectl. This can be done by instructing the package management system (APT) to prevent those packages from being upgraded, using the apt-mark hold command.
sudo apt-mark hold kubeadm kubelet kubectl
To check whether packages are on hold or not, you can use apt-mark showhold:
sudo apt-mark showhold
If you want to allow apt to upgrade these packages again, you can remove the hold using apt-mark unhold:
sudo apt-mark unhold kubeadm kubelet kubectl
Initialize Kubernetes Cluster on Control Plane using Kubeadm
Once the above steps are completed, initialize the Kubernetes cluster on the master node. The Kubernetes master is responsible for maintaining the desired state for your cluster.
We will be using the kubeadm tool to deploy our K8S cluster.
The cluster can be initialized using the kubeadm tool by passing the init command argument;
kubeadm init <args>
Some of the common arguments/options include;
- --apiserver-advertise-address: Defines the IP address the API server will listen on. If not set, the default network interface will be used. Example: --apiserver-advertise-address=192.168.122.60.
- --pod-network-cidr: Specifies the range of IP addresses for the Pod network. If set, the control plane will automatically allocate CIDRs for every node. Use this to define your preferred network range if there is a chance of collision between your network plugin's preferred Pod network and some of your host networks, e.g. --pod-network-cidr=10.100.0.0/16.
- --control-plane-endpoint: Specifies the hostname and port that the API server will listen on. This is recommended over --apiserver-advertise-address because it enables you to define a shared endpoint, such as a load-balanced DNS name or an IP address, that can be reused when you upgrade from a single master node to a highly available control plane. For example, --control-plane-endpoint=cluster.kifarunix-demo.com:6443.
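As an alternative to passing flags on the command line, the same options can be captured in a kubeadm configuration file and supplied with --config; a minimal sketch using this guide's values (the v1beta3 API version matches kubeadm v1.30):

```yaml
# kubeadm-config.yaml -- sketch equivalent to the init flags used in this guide
apiVersion: kubeadm.k8s.io/v1beta3
kind: InitConfiguration
localAPIEndpoint:
  advertiseAddress: 192.168.122.60   # same as --apiserver-advertise-address
---
apiVersion: kubeadm.k8s.io/v1beta3
kind: ClusterConfiguration
networking:
  podSubnet: 10.100.0.0/16           # same as --pod-network-cidr
```

You would then run sudo kubeadm init --config kubeadm-config.yaml instead of the flag form.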
Since we are running a single-master Kubernetes cluster in this guide (for demo purposes), with no plans to upgrade to a highly available cluster, we will specify just the IP address of the control plane while bootstrapping our cluster.
Thus, run the command below on the master node to bootstrap the Kubernetes control-plane node.
sudo kubeadm init --apiserver-advertise-address=192.168.122.60 --pod-network-cidr=10.100.0.0/16
The command will start by pre-pulling (kubeadm config images pull) the required container images for a Kubernetes cluster before initializing the cluster.
Once the initialization is done, you should be able to see an output similar to the one below;
[init] Using Kubernetes version: v1.30.0
[preflight] Running pre-flight checks
[preflight] Pulling images required for setting up a Kubernetes cluster
[preflight] This might take a minute or two, depending on the speed of your internet connection
[preflight] You can also perform this action in beforehand using 'kubeadm config images pull'
[certs] Using certificateDir folder "/etc/kubernetes/pki"
[certs] Generating "ca" certificate and key
[certs] Generating "apiserver" certificate and key
[certs] apiserver serving cert is signed for DNS names [kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local master.kifarunix.com] and IPs [10.96.0.1 192.168.122.60]
[certs] Generating "apiserver-kubelet-client" certificate and key
[certs] Generating "front-proxy-ca" certificate and key
[certs] Generating "front-proxy-client" certificate and key
[certs] Generating "etcd/ca" certificate and key
[certs] Generating "etcd/server" certificate and key
[certs] etcd/server serving cert is signed for DNS names [localhost master.kifarunix.com] and IPs [192.168.122.60 127.0.0.1 ::1]
[certs] Generating "etcd/peer" certificate and key
[certs] etcd/peer serving cert is signed for DNS names [localhost master.kifarunix.com] and IPs [192.168.122.60 127.0.0.1 ::1]
[certs] Generating "etcd/healthcheck-client" certificate and key
[certs] Generating "apiserver-etcd-client" certificate and key
[certs] Generating "sa" key and public key
[kubeconfig] Using kubeconfig folder "/etc/kubernetes"
[kubeconfig] Writing "admin.conf" kubeconfig file
[kubeconfig] Writing "super-admin.conf" kubeconfig file
[kubeconfig] Writing "kubelet.conf" kubeconfig file
[kubeconfig] Writing "controller-manager.conf" kubeconfig file
[kubeconfig] Writing "scheduler.conf" kubeconfig file
[etcd] Creating static Pod manifest for local etcd in "/etc/kubernetes/manifests"
[control-plane] Using manifest folder "/etc/kubernetes/manifests"
[control-plane] Creating static Pod manifest for "kube-apiserver"
[control-plane] Creating static Pod manifest for "kube-controller-manager"
[control-plane] Creating static Pod manifest for "kube-scheduler"
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Starting the kubelet
[wait-control-plane] Waiting for the kubelet to boot up the control plane as static Pods from directory "/etc/kubernetes/manifests"
[kubelet-check] Waiting for a healthy kubelet. This can take up to 4m0s
[kubelet-check] The kubelet is healthy after 501.313365ms
[api-check] Waiting for a healthy API server. This can take up to 4m0s
[api-check] The API server is healthy after 4.003829486s
[upload-config] Storing the configuration used in ConfigMap "kubeadm-config" in the "kube-system" Namespace
[kubelet] Creating a ConfigMap "kubelet-config" in namespace kube-system with the configuration for the kubelets in the cluster
[upload-certs] Skipping phase. Please see --upload-certs
[mark-control-plane] Marking the node master.kifarunix.com as control-plane by adding the labels: [node-role.kubernetes.io/control-plane node.kubernetes.io/exclude-from-external-load-balancers]
[mark-control-plane] Marking the node master.kifarunix.com as control-plane by adding the taints [node-role.kubernetes.io/control-plane:NoSchedule]
[bootstrap-token] Using token: 2ntlip.lsw3yriy62bs16pp
[bootstrap-token] Configuring bootstrap tokens, cluster-info ConfigMap, RBAC Roles
[bootstrap-token] Configured RBAC rules to allow Node Bootstrap tokens to get nodes
[bootstrap-token] Configured RBAC rules to allow Node Bootstrap tokens to post CSRs in order for nodes to get long term certificate credentials
[bootstrap-token] Configured RBAC rules to allow the csrapprover controller automatically approve CSRs from a Node Bootstrap Token
[bootstrap-token] Configured RBAC rules to allow certificate rotation for all node client certificates in the cluster
[bootstrap-token] Creating the "cluster-info" ConfigMap in the "kube-public" namespace
[kubelet-finalize] Updating "/etc/kubernetes/kubelet.conf" to point to a rotatable kubelet client certificate and key
[addons] Applied essential addon: CoreDNS
[addons] Applied essential addon: kube-proxy
Your Kubernetes control-plane has initialized successfully!
To start using your cluster, you need to run the following as a regular user:
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
Alternatively, if you are the root user, you can run:
export KUBECONFIG=/etc/kubernetes/admin.conf
You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
https://kubernetes.io/docs/concepts/cluster-administration/addons/
Then you can join any number of worker nodes by running the following on each as root:
kubeadm join 192.168.122.60:6443 --token 2ntlip.lsw3yriy62bs16pp \
--discovery-token-ca-cert-hash sha256:a22e5b78b50c54af7de5390ec804b311d28ea40048d9c6b66ee21660bbe4d212
As suggested on the output above, you need to run the commands provided on the master node to start using your cluster.
Be sure to run the commands as regular user (recommended), with sudo rights.
Thus, if you are root, switch to a regular user with sudo rights (kifarunix is our regular user; yours may differ);
su - kifarunix
Next, create a Kubernetes cluster directory.
mkdir -p $HOME/.kube
Copy Kubernetes admin configuration file to the cluster directory created above.
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
Set the proper ownership for the cluster configuration file.
sudo chown $(id -u):$(id -g) $HOME/.kube/config
Verify the status of the Kubernetes cluster;
kubectl get nodes
NAME STATUS ROLES AGE VERSION
master.kifarunix.com NotReady control-plane 8m14s v1.30.0
As you can see, the cluster is not ready yet; the node remains NotReady until a Pod network add-on is deployed.
You can also get the address of the control plane and cluster services;
kubectl cluster-info
Kubernetes control plane is running at https://192.168.122.60:6443
CoreDNS is running at https://192.168.122.60:6443/api/v1/namespaces/kube-system/services/kube-dns:dns/proxy
To further debug and diagnose cluster problems, use 'kubectl cluster-info dump'.
Install Pod Network Addon on Master Node
A Pod is a group of one or more related containers in a Kubernetes cluster. They share the same lifecycle, storage/network. For Pods to communicate with one another, you must deploy a Container Network Interface (CNI) based Pod network add-on.
There are multiple Pod network addons that you can choose from. Refer to Addons page for more information.
To deploy a CNI Pod network, run the command below on the master node;
kubectl apply -f [podnetwork].yaml
Where [podnetwork].yaml is the path to your preferred CNI manifest file. In this demo, we will use the Calico network plugin.
Install Calico Pod network addon Operator by running the command below. Execute the command as the user with which you created the Kubernetes cluster.
As of this writing, the current release version is v3.28.0. Get the current release version from the releases page and replace the value of CNI_VER below.
CNI_VER=3.28.0
kubectl create -f https://raw.githubusercontent.com/projectcalico/calico/v${CNI_VER}/manifests/tigera-operator.yaml
namespace/tigera-operator created
customresourcedefinition.apiextensions.k8s.io/bgpconfigurations.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/bgpfilters.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/bgppeers.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/blockaffinities.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/caliconodestatuses.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/clusterinformations.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/felixconfigurations.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/globalnetworkpolicies.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/globalnetworksets.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/hostendpoints.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/ipamblocks.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/ipamconfigs.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/ipamhandles.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/ippools.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/ipreservations.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/kubecontrollersconfigurations.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/networkpolicies.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/networksets.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/apiservers.operator.tigera.io created
customresourcedefinition.apiextensions.k8s.io/imagesets.operator.tigera.io created
customresourcedefinition.apiextensions.k8s.io/installations.operator.tigera.io created
customresourcedefinition.apiextensions.k8s.io/tigerastatuses.operator.tigera.io created
serviceaccount/tigera-operator created
clusterrole.rbac.authorization.k8s.io/tigera-operator created
clusterrolebinding.rbac.authorization.k8s.io/tigera-operator created
deployment.apps/tigera-operator created
Next, download the custom resources necessary to configure Calico. The default Pod network for the Calico plugin is 192.168.0.0/16. If you used a custom Pod CIDR as defined above (--pod-network-cidr=10.100.0.0/16), download the custom resource file and modify the network to match your custom one.
We will use the manifest of the same CNI version as above.
wget https://raw.githubusercontent.com/projectcalico/calico/v${CNI_VER}/manifests/custom-resources.yaml
cat custom-resources.yaml
# This section includes base Calico installation configuration.
# For more information, see: https://docs.tigera.io/calico/latest/reference/installation/api#operator.tigera.io/v1.Installation
apiVersion: operator.tigera.io/v1
kind: Installation
metadata:
name: default
spec:
# Configures Calico networking.
calicoNetwork:
ipPools:
- name: default-ipv4-ippool
blockSize: 26
cidr: 192.168.0.0/16
encapsulation: VXLANCrossSubnet
natOutgoing: Enabled
nodeSelector: all()
---
# This section configures the Calico API server.
# For more information, see: https://docs.tigera.io/calico/latest/reference/installation/api#operator.tigera.io/v1.APIServer
apiVersion: operator.tigera.io/v1
kind: APIServer
metadata:
name: default
spec: {}
By default, the network section of the custom resource file looks like below;
- blockSize: 26
cidr: 192.168.0.0/16
Update the network subnet to match your subnet.
sed -i 's/192.168/10.100/' custom-resources.yaml
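Note that the pattern replaces the first occurrence of 192.168 on every matching line, so it is worth previewing the substitution on a scratch copy first; a minimal sketch:

```shell
# Preview the CIDR substitution on a scratch copy before editing in place
tmp=$(mktemp)
printf '%s\n' '      cidr: 192.168.0.0/16' > "$tmp"
sed 's/192.168/10.100/' "$tmp"   # cidr: 10.100.0.0/16
rm -f "$tmp"
```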
Apply the changes
kubectl create -f custom-resources.yaml
Sample output;
installation.operator.tigera.io/default created
apiserver.operator.tigera.io/default created
Get Running Pods in the Kubernetes cluster
Once the command completes, you can list the Pods in all namespaces by running the command below;
kubectl get pods --all-namespaces
NAMESPACE NAME READY STATUS RESTARTS AGE
calico-system calico-kube-controllers-795f68c999-xg8l6 0/1 ContainerCreating 0 31s
calico-system calico-node-phgxz 0/1 Running 0 24s
calico-system calico-typha-5c68d5959d-kkxwc 1/1 Terminating 0 32s
calico-system calico-typha-5ff5999dc9-vr2q9 0/1 Pending 0 31s
calico-system csi-node-driver-fjxfd 0/2 ContainerCreating 0 32s
kube-system coredns-7db6d8ff4d-762fg 0/1 Running 0 24m
kube-system coredns-7db6d8ff4d-flnrq 0/1 Running 0 24m
kube-system etcd-master.kifarunix.com 1/1 Running 2 24m
kube-system kube-apiserver-master.kifarunix.com 1/1 Running 2 24m
kube-system kube-controller-manager-master.kifarunix.com 1/1 Running 2 24m
kube-system kube-proxy-jv9x9 1/1 Running 0 24m
kube-system kube-scheduler-master.kifarunix.com 1/1 Running 2 24m
tigera-operator tigera-operator-7d5cd7fcc8-7bw5j 1/1 Running 0 12m
You can also list Pods in a specific namespace;
kubectl get pods -n calico-system
NAMESPACE NAME READY STATUS RESTARTS AGE
calico-system calico-kube-controllers-795f68c999-xg8l6 0/1 ContainerCreating 0 31s
calico-system calico-node-phgxz 0/1 Running 0 24s
calico-system calico-typha-5c68d5959d-kkxwc 1/1 Terminating 0 32s
calico-system calico-typha-5ff5999dc9-vr2q9 0/1 Pending 0 31s
calico-system csi-node-driver-fjxfd 0/2 ContainerCreating 0 32s
As can be seen, the Pods in the calico-system namespace are being created and coming up; give them a few minutes to reach the Running state.
Open Kubernetes Cluster Ports on Firewall
If a firewall is running on the nodes, then there are some ports that need to be opened on the firewall;
Control Plane ports;

Protocol  Direction  Port Range  Purpose                  Used By
TCP       Inbound    6443        Kubernetes API server    All
TCP       Inbound    2379-2380   etcd server client API   kube-apiserver, etcd
TCP       Inbound    10250       Kubelet API              Self, Control plane
TCP       Inbound    10259       kube-scheduler           Self
TCP       Inbound    10257       kube-controller-manager  Self
So the ports that should be open and accessible from outside the node are:
- 6443 – Kubernetes API server (secure port)
- 2379-2380 – etcd server client API
- 10250 – Kubelet API
- 10259 – kube-scheduler
- 10257 – kube-controller-manager
In my setup, I am using UFW. Hence, open the ports below on the Master/Control Plane;
for i in 6443 2379:2380 10250 10257 10259; do sudo ufw allow from any to any port $i proto tcp; done
You can restrict access to the API from specific networks/IPs.
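For example, to allow the API server port only from the cluster subnet used in this tutorial (192.168.122.0/24; adjust to your own network);

```shell
# Restrict the Kubernetes API server port to the cluster subnet only
CLUSTER_NET=192.168.122.0/24
sudo ufw allow from "$CLUSTER_NET" to any port 6443 proto tcp comment "Kubernetes API server"
```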
Worker Nodes;

Protocol  Direction  Port Range   Purpose            Used By
TCP       Inbound    10250        Kubelet API        Self, Control plane
TCP       Inbound    30000-32767  NodePort Services  All
On each Worker node, open the Kubelet API port and the NodePort Services range;
sudo ufw allow from any to any port 10250 proto tcp comment "Open Kubelet API port"
sudo ufw allow from any to any port 30000:32767 proto tcp comment "NodePort Services"
You can restrict access to the API from specific networks/IPs.
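For instance, since only the control plane (and the node itself) needs to reach the Kubelet API, you could restrict port 10250 to the master IP used in this setup;

```shell
# Allow Kubelet API access from the control plane node only
CONTROL_PLANE_IP=192.168.122.60
sudo ufw allow from "$CONTROL_PLANE_IP" to any port 10250 proto tcp comment "Kubelet API from control plane"
```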
Add Worker Nodes to Kubernetes Cluster
You can now add Worker nodes to the Kubernetes cluster using the kubeadm join command as follows.
Before that, ensure that the container runtime is installed, configured and running on each worker node. We are using the containerd CRI;
systemctl status containerd
Sample output from worker01 node;
● containerd.service - containerd container runtime
Loaded: loaded (/usr/lib/systemd/system/containerd.service; enabled; preset: enabled)
Active: active (running) since Mon 2024-05-13 20:38:59 UTC; 20h ago
Docs: https://containerd.io
Process: 4430 ExecStartPre=/sbin/modprobe overlay (code=exited, status=0/SUCCESS)
Main PID: 4432 (containerd)
Tasks: 8
Memory: 13.4M (peak: 13.9M)
CPU: 1min 45.227s
CGroup: /system.slice/containerd.service
└─4432 /usr/bin/containerd
May 13 20:38:59 wk01.kifarunix.com containerd[4432]: time="2024-05-13T20:38:59.786586300Z" level=info msg="Start subscribing containerd event"
May 13 20:38:59 wk01.kifarunix.com containerd[4432]: time="2024-05-13T20:38:59.786651551Z" level=info msg="Start recovering state"
May 13 20:38:59 wk01.kifarunix.com containerd[4432]: time="2024-05-13T20:38:59.786806494Z" level=info msg=serving... address=/run/containerd/containerd.sock.ttrpc
May 13 20:38:59 wk01.kifarunix.com containerd[4432]: time="2024-05-13T20:38:59.786874950Z" level=info msg="Start event monitor"
May 13 20:38:59 wk01.kifarunix.com containerd[4432]: time="2024-05-13T20:38:59.786931542Z" level=info msg="Start snapshots syncer"
May 13 20:38:59 wk01.kifarunix.com containerd[4432]: time="2024-05-13T20:38:59.786886363Z" level=info msg=serving... address=/run/containerd/containerd.sock
May 13 20:38:59 wk01.kifarunix.com containerd[4432]: time="2024-05-13T20:38:59.786971950Z" level=info msg="Start cni network conf syncer for default"
May 13 20:38:59 wk01.kifarunix.com containerd[4432]: time="2024-05-13T20:38:59.787003012Z" level=info msg="Start streaming server"
May 13 20:38:59 wk01.kifarunix.com containerd[4432]: time="2024-05-13T20:38:59.787040271Z" level=info msg="containerd successfully booted in 0.021146s"
May 13 20:38:59 wk01.kifarunix.com systemd[1]: Started containerd.service - containerd container runtime.
Once you have confirmed that, get the cluster join command that was output during cluster bootstrapping and execute it on each worker node.
Note that this command is displayed after initializing the control plane above and it should be executed with root privileges.
sudo kubeadm join 192.168.122.60:6443 \
--token 2ntlip.lsw3yriy62bs16pp \
--discovery-token-ca-cert-hash sha256:a22e5b78b50c54af7de5390ec804b311d28ea40048d9c6b66ee21660bbe4d212
If you didn’t save the Kubernetes cluster join command, you can print it at any time by running the command below on the Master/control plane;
sudo kubeadm token create --print-join-command
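By default, tokens created by kubeadm are valid for 24 hours. If you prefer a shorter-lived token, kubeadm lets you set a TTL when creating it;

```shell
# Create a join token that expires after 2 hours and print the full join command
TTL=2h
sudo kubeadm token create --ttl "$TTL" --print-join-command
```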
Once the command runs, you will get an output similar to below;
[preflight] Running pre-flight checks
[preflight] Reading configuration from the cluster...
[preflight] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -o yaml'
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Starting the kubelet
[kubelet-check] Waiting for a healthy kubelet. This can take up to 4m0s
[kubelet-check] The kubelet is healthy after 501.461586ms
[kubelet-start] Waiting for the kubelet to perform the TLS Bootstrap
This node has joined the cluster:
* Certificate signing request was sent to apiserver and a response was received.
* The Kubelet was informed of the new secure connection details.
Run 'kubectl get nodes' on the control-plane to see this node join the cluster.
On the Kubernetes control plane (master), as the regular user with which you created the cluster, run the command below to verify that the nodes have joined the cluster.
kubectl get nodes
NAME STATUS ROLES AGE VERSION
master.kifarunix.com Ready control-plane 20h v1.30.0
wk01.kifarunix.com Ready <none> 107s v1.30.0
wk02.kifarunix.com NotReady <none> 11s v1.30.0
wk03.kifarunix.com NotReady <none> 6s v1.30.0
There are different node statuses;
- NotReady: The node has been added to the cluster but is not yet ready to accept workloads.
- SchedulingDisabled: The node is not able to receive new workloads because it is marked as unschedulable.
- Ready: The node is healthy and ready to accept workloads.
- OutOfDisk: Indicates that the node is running out of disk space (deprecated in recent Kubernetes releases).
- MemoryPressure: Indicates that the node is running low on memory.
- PIDPressure: Indicates that there are too many processes on the node.
- DiskPressure: Indicates that the node is running low on disk space.
- NetworkUnavailable: Indicates that the node's network is not correctly configured.
- Unschedulable: Indicates that the node is not schedulable for new workloads.
- ConditionUnknown: Indicates that the node status is unknown due to an error, for example when the node controller cannot reach the node.
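Most of these statuses are derived from per-node conditions. You can inspect the raw conditions for a node, e.g. wk01, with something like the following (the section boundaries in describe output may vary slightly);

```shell
# Print the Conditions section of the node description
NODE=wk01.kifarunix.com
kubectl describe node "$NODE" | sed -n '/Conditions:/,/Addresses:/p'
```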
The ROLES of the Worker nodes may show up as <none>. This is okay; no role label is assigned to worker nodes by default, and unlike the control-plane role, it is never set automatically.
You can however set this ROLE yourself using the command;
kubectl label node <worker-node-name> node-role.kubernetes.io/worker=true
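To label all three workers in this setup at once, a simple loop works (hostnames as per the node table above);

```shell
# Label each worker node with the worker role, then confirm the ROLES column
for node in wk01 wk02 wk03; do
  kubectl label node "${node}.kifarunix.com" node-role.kubernetes.io/worker=true
done
kubectl get nodes
```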
Get Kubernetes Cluster Information
As you can see, we now have a cluster. Run the command below to get cluster information.
kubectl cluster-info
Kubernetes control plane is running at https://192.168.122.60:6443
CoreDNS is running at https://192.168.122.60:6443/api/v1/namespaces/kube-system/services/kube-dns:dns/proxy
To further debug and diagnose cluster problems, use 'kubectl cluster-info dump'.
List Kubernetes Cluster API Resources
You can list all Kubernetes cluster resources using the command below;
kubectl api-resources
NAME SHORTNAMES APIVERSION NAMESPACED KIND
bindings v1 true Binding
componentstatuses cs v1 false ComponentStatus
configmaps cm v1 true ConfigMap
endpoints ep v1 true Endpoints
events ev v1 true Event
limitranges limits v1 true LimitRange
namespaces ns v1 false Namespace
nodes no v1 false Node
persistentvolumeclaims pvc v1 true PersistentVolumeClaim
persistentvolumes pv v1 false PersistentVolume
pods po v1 true Pod
podtemplates v1 true PodTemplate
replicationcontrollers rc v1 true ReplicationController
resourcequotas quota v1 true ResourceQuota
secrets v1 true Secret
serviceaccounts sa v1 true ServiceAccount
services svc v1 true Service
mutatingwebhookconfigurations admissionregistration.k8s.io/v1 false MutatingWebhookConfiguration
validatingadmissionpolicies admissionregistration.k8s.io/v1 false ValidatingAdmissionPolicy
validatingadmissionpolicybindings admissionregistration.k8s.io/v1 false ValidatingAdmissionPolicyBinding
validatingwebhookconfigurations admissionregistration.k8s.io/v1 false ValidatingWebhookConfiguration
customresourcedefinitions crd,crds apiextensions.k8s.io/v1 false CustomResourceDefinition
apiservices apiregistration.k8s.io/v1 false APIService
controllerrevisions apps/v1 true ControllerRevision
daemonsets ds apps/v1 true DaemonSet
deployments deploy apps/v1 true Deployment
replicasets rs apps/v1 true ReplicaSet
statefulsets sts apps/v1 true StatefulSet
selfsubjectreviews authentication.k8s.io/v1 false SelfSubjectReview
tokenreviews authentication.k8s.io/v1 false TokenReview
localsubjectaccessreviews authorization.k8s.io/v1 true LocalSubjectAccessReview
selfsubjectaccessreviews authorization.k8s.io/v1 false SelfSubjectAccessReview
selfsubjectrulesreviews authorization.k8s.io/v1 false SelfSubjectRulesReview
subjectaccessreviews authorization.k8s.io/v1 false SubjectAccessReview
horizontalpodautoscalers hpa autoscaling/v2 true HorizontalPodAutoscaler
cronjobs cj batch/v1 true CronJob
jobs batch/v1 true Job
certificatesigningrequests csr certificates.k8s.io/v1 false CertificateSigningRequest
leases coordination.k8s.io/v1 true Lease
bgpconfigurations crd.projectcalico.org/v1 false BGPConfiguration
bgpfilters crd.projectcalico.org/v1 false BGPFilter
bgppeers crd.projectcalico.org/v1 false BGPPeer
blockaffinities crd.projectcalico.org/v1 false BlockAffinity
caliconodestatuses crd.projectcalico.org/v1 false CalicoNodeStatus
clusterinformations crd.projectcalico.org/v1 false ClusterInformation
felixconfigurations crd.projectcalico.org/v1 false FelixConfiguration
globalnetworkpolicies crd.projectcalico.org/v1 false GlobalNetworkPolicy
globalnetworksets crd.projectcalico.org/v1 false GlobalNetworkSet
hostendpoints crd.projectcalico.org/v1 false HostEndpoint
ipamblocks crd.projectcalico.org/v1 false IPAMBlock
ipamconfigs crd.projectcalico.org/v1 false IPAMConfig
ipamhandles crd.projectcalico.org/v1 false IPAMHandle
ippools crd.projectcalico.org/v1 false IPPool
ipreservations crd.projectcalico.org/v1 false IPReservation
kubecontrollersconfigurations crd.projectcalico.org/v1 false KubeControllersConfiguration
networkpolicies crd.projectcalico.org/v1 true NetworkPolicy
networksets crd.projectcalico.org/v1 true NetworkSet
endpointslices discovery.k8s.io/v1 true EndpointSlice
events ev events.k8s.io/v1 true Event
flowschemas flowcontrol.apiserver.k8s.io/v1 false FlowSchema
prioritylevelconfigurations flowcontrol.apiserver.k8s.io/v1 false PriorityLevelConfiguration
ingressclasses networking.k8s.io/v1 false IngressClass
ingresses ing networking.k8s.io/v1 true Ingress
networkpolicies netpol networking.k8s.io/v1 true NetworkPolicy
runtimeclasses node.k8s.io/v1 false RuntimeClass
apiservers operator.tigera.io/v1 false APIServer
imagesets operator.tigera.io/v1 false ImageSet
installations operator.tigera.io/v1 false Installation
tigerastatuses operator.tigera.io/v1 false TigeraStatus
poddisruptionbudgets pdb policy/v1 true PodDisruptionBudget
clusterrolebindings rbac.authorization.k8s.io/v1 false ClusterRoleBinding
clusterroles rbac.authorization.k8s.io/v1 false ClusterRole
rolebindings rbac.authorization.k8s.io/v1 true RoleBinding
roles rbac.authorization.k8s.io/v1 true Role
priorityclasses pc scheduling.k8s.io/v1 false PriorityClass
csidrivers storage.k8s.io/v1 false CSIDriver
csinodes storage.k8s.io/v1 false CSINode
csistoragecapacities storage.k8s.io/v1 true CSIStorageCapacity
storageclasses sc storage.k8s.io/v1 false StorageClass
volumeattachments storage.k8s.io/v1 false VolumeAttachment
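The list is long; kubectl api-resources supports filters if you only care about a subset, for example the namespaced resources in the apps API group;

```shell
# Show only namespaced resources from the apps API group, as bare names
GROUP=apps
kubectl api-resources --namespaced=true --api-group="$GROUP" -o name
```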
You are now ready to deploy an application on the Kubernetes cluster.
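As a quick smoke test (a hypothetical nginx deployment, not part of the cluster setup itself), you could deploy and expose a simple application;

```shell
# Deploy nginx and expose it on a NodePort so it is reachable on any worker node
APP=nginx
kubectl create deployment "$APP" --image=nginx
kubectl expose deployment "$APP" --type=NodePort --port=80
# Check the assigned NodePort (in the 30000-32767 range opened on the firewall earlier)
kubectl get svc "$APP"
```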
Want to dive deeper in getting Kubernetes up and running? Check out this book, Kubernetes: Up and Running: Dive into the Future of Infrastructure 3rd Edition by Brendan Burns.
Remove Worker Nodes from Cluster
You can gracefully remove a node from Kubernetes cluster as described in the guide below;
Gracefully Remove Worker Node from Kubernetes Cluster
AppArmor Blocks runc Signals, Pods Stuck Terminating
You might have realized that in recent versions of Ubuntu, there is an issue whereby draining nodes or deleting pods gets stuck, with errors like the one below in the AppArmor logs;
2024-06-14T19:04:43.331091+00:00 worker-01 kernel: audit: type=1400 audit(1718391883.329:221): apparmor="DENIED" operation="signal" class="signal" profile="cri-containerd.apparmor.d" pid=7445 comm="runc" requested_mask="receive" denied_mask="receive" signal=kill peer="runc
This is a bug in the AppArmor profile that denies signals from runc, resulting in many pods being stuck in a terminating state. The bug was reported by Sebastian Podjasek on 2024-05-10. It affects the Ubuntu containerd-app package.
Read how to fix on kubectl drain node gets stuck forever [Apparmor Bug]
Install Kubernetes Dashboard
You can manage your cluster from the Kubernetes Dashboard. See the guide below;
How to Install Kubernetes Dashboard