Gracefully Remove Worker Node from Kubernetes Cluster

|
Last Updated:
|
|
Gracefully Remove Worker Node from Kubernetes Cluster

In this tutorial, you will learn how to gracefully remove worker node from Kubernetes cluster. If you are managing a Kubernetes cluster, there might arise a need to remove a worker node from it. The process of removing a worker node from a Kubernetes cluster may slightly vary with the tool or platform that is used to deploy Kubernetes cluster. For example, the process of removing a worker node from a kubeadm Kubernetes cluster might be different from removing a worker node from a Kubernetes deployed using GKE, Rancher, OpenShift, Kubermatic, or kops.

Gracefully Remove Worker Node from Kubernetes Cluster

In this guide, we will learn how to safely or gracefully remove a worker node from a Kubernetes cluster created with kubeadm.

So, what are the steps to gracefully remove worker node from Kubernetes cluster created using kubeadm?

Prepare the Node for Removal

In a production environment, you can’t just wake up and decide to remove a node that is running production workloads. You need to prepare for the removal in order to minimize the risk of downtime or performance issues in the Kubernetes.

Preparing the node for removal from Kubernetes cluster involves various tasks such as;

  • Ensure that there is sufficient resources in the cluster to support the cluster workloads
  • Ensure that services hosted by specific node being removed are distributed across Kubernetes cluster nodes.
  • Backup any data and configurations on the node being removed from the cluster.
  • Ensure that removal of the node doesn’t affect any networking settings on the cluster.

Drain the Worker Node

Before you can remove a worker node from Kubernetes cluster, you need to migrate all the Pods scheduled on that node to other nodes in the cluster. This is what is called draining a node in Kubernetes and is one of the Kubernetes workloads voluntary disruption action. This will ensure that the workloads runs uninterrupted during the decommissioning of the node.

kubectl drain is the command that can be used to migrate the nodes running on the node marked for decommissioning into other cluster nodes.

The command needs to be executed on the Kubernetes control plane/master node.

The syntax of the kubectl drain command is;

kubectl drain NODE [options]

List the nodes on the cluster;

kubectl get nodes

Sample output;


NAME                        STATUS   ROLES           AGE     VERSION
master.kifarunix-demo.com   Ready    control-plane   6d16h   v1.27.1
wk01.kifarunix-demo.com     Ready    worker          6d16h   v1.27.1
wk02.kifarunix-demo.com     Ready    <none>          6d15h   v1.27.1
wk03.kifarunix-demo.com     Ready    <none>          6d15h   v1.27.1

You can then run the command below to drain the worker node marked for decommissioning.

For example, let’s drain wk01 in example output above;

kubectl drain wk01.kifarunix-demo.com

The ‘drain’ command waits for graceful termination. You should not operate on the machine until the command completes.

The drain command will also cordone the node. This means that the node is marked as unschedulable and prevents the Kubernetes scheduler from placing any new pods onto the node.

If the node has daemon set-managed pods, the drain command wont drain it. Similarly, if there are any pods that are neither mirror pods nor managed by a replication controller, replica set, daemon set, stateful set, or job, then drain will not delete any pods unless you use --force. The --force will also allow deletion to proceed if the managing resource of one or more pods is missing.

Sample output in my drain command above;


node/wk01.kifarunix-demo.com cordoned
error: unable to drain node "wk01.kifarunix-demo.com" due to error:cannot delete DaemonSet-managed Pods (use --ignore-daemonsets to ignore): calico-system/calico-node-g4vlv, calico-system/csi-node-driver-9b9xx, kube-system/kube-proxy-sn66b, continuing command...
There are pending nodes to be drained:
 wk01.kifarunix-demo.com
cannot delete DaemonSet-managed Pods (use --ignore-daemonsets to ignore): calico-system/calico-node-g4vlv, calico-system/csi-node-driver-9b9xx, kube-system/kube-proxy-sn66b

To forcefully drain the node, i just add the --force, --ignore-daemonsets options to the command.

kubectl drain wk01.kifarunix-demo.com --force --ignore-daemonsets

Use these flags with caution as it can lead to disruption of running pods

As already mentioned, let the command terminates gracefully.

Note that this process may take sometime to complete.

Check more options using the --help option.

kubectl drain --help

Once the command runs, you can check nodes status to ensure that it is now not read to accept any workloads.

There are also a number of constraints that may affect the drain process of a node;

  • Pod Disruption Budgets (PDBs): This is a Kubernetes feature that sets a minimum number of replicas a given pod must have during voluntary disruption actions such during node maintenance, updates, etc. Draining a node a cannot violate the PDBs, as this could result in a service disruption. And thus, your drain command may just hang and does nothing. You can check for any PDBs on all namespaces;
    kubectl get poddisruptionbudget --all-namespaces.
  • Resource Constraints: The cluster must have enough resources to accommodate the Pods being evicted from the node being drained.
  • Node Affinity/Anti-affinity. These are some of the Kubernetes Pod scheduler decision making concepts. Affinity defines conditions under which a Pod is scheduled on a node while anti-affinity define conditions under which a Pod should not be scheduled on a node. Drain process respects whatever conditions defined by the scheduler
  • Daemonsets: In Kubernetes, daemonsets ensure that a pod is running on every node in the cluster. Before a node is successfully drained, the pods running on it must be rescheduled elsewhere. Drain constraints must account for daemonsets to avoid service disruption.
  • Taints/Tolerations: Taints is the opposite of Node affinity. They are defined to prevent certain pods from being scheduled on specific nodes. Tolerations on the other hand are applied to Pods to allow them to be scheduled on nodes with matching taints. The drain command will only evict Pods from tainted nodes if they have tolerations that match.

Delete a Worker Node

Once the node has been successfully evicted, you can now remove it from the cluster.

You can remove an evicted node from the cluster using the kubectl delete command.

kubectl delete node wk01.kifarunix-demo.com

Confirm Removal of Worker Node

You can verify that the node has been removed using the kubectl get nodes command.

kubectl get nodes

NAME                        STATUS   ROLES           AGE     VERSION
master.kifarunix-demo.com   Ready    control-plane   6d21h   v1.27.1
wk02.kifarunix-demo.com     Ready    <none>          6d21h   v1.27.1
wk03.kifarunix-demo.com     Ready    <none>          6d21h   v1.27.1

wk01 is gone!

You can also check Pods;

kubectl get pod --all-namespaces

To list Pods running on a specific node;

kubectl get pods --field-selector spec.nodeName=<node-name>

E.g;

kubectl get pods --field-selector spec.nodeName=wk01.kifarunix-demo.com

You should have now successfully and gracefully removed a node from the kubeadm Kubernetes cluster.

Optionally Reset the Node

You can optionally remove all the configurations related to K8s on the node.

Thus, login to the node that was drained and run the command below;

kubeadm reset

Conclusion

In summary, the steps you need to take gracefully remove worker node from cluster;

  • ensure cluster is ready to support workload after removal of a specific node
  • drain the node to from the cluster (kubectl drain <node-name>)
  • fix any constraints that might prevent the removal of a node
  • delete the node from the cluster

That marks the end of our guide on how to gracefully remove worker node from Kubernetes cluster.

Other Tutorials

What are the core concepts in Kubernetes?

Kubernetes Architecture: A High-level Overview of Kubernetes Cluster Components

SUPPORT US VIA A VIRTUAL CUP OF COFFEE

We're passionate about sharing our knowledge and experiences with you through our blog. If you appreciate our efforts, consider buying us a virtual coffee. Your support keeps us motivated and enables us to continually improve, ensuring that we can provide you with the best content possible. Thank you for being a coffee-fueled champion of our work!

Photo of author
Kifarunix
Linux Certified Engineer, with a passion for open-source technology and a strong understanding of Linux systems. With experience in system administration, troubleshooting, and automation, I am skilled in maintaining and optimizing Linux infrastructure.

Leave a Comment