
In this tutorial, you will learn how to extend the OpenShift CoreOS /sysroot root filesystem. Extending the /sysroot filesystem is a critical task for administrators running OpenShift clusters. OpenShift CoreOS (RHCOS) uses an immutable root filesystem (/sysroot), which can pose challenges when additional storage is needed for applications, logs, or system updates. This guide provides a step-by-step process to safely extend the /sysroot filesystem in a KVM-based virtualized test OpenShift cluster environment, ensuring minimal downtime and data integrity.
Extend OpenShift CoreOS /sysroot Root Filesystem
Why Extend the /sysroot Filesystem?
Running OpenShift in a test environment offers flexibility for experimenting with virtualized clusters, but the default disk size allocated to RHCOS VMs may become insufficient. Common scenarios requiring filesystem extension include:
- Logs: Growing application or system logs.
- Workloads: Additional containers or services.
- Updates: CoreOS updates needing temporary storage.
In KVM test environments, extending the /sysroot filesystem involves resizing the virtual disk, growing the partition, and expanding the filesystem. This process must account for RHCOS’s immutable nature and KVM’s virtual disk management. For bare-metal deployments, consult Red Hat support to ensure compatibility and safety.
Identifying Signs of Low Disk Space on Nodes
Before it gets to the point where extending /sysroot is necessary, it’s critical to detect when disk space on nodes is running low. Common signs include:
- Service Failures: Pods or services fail to start or take too long to schedule, staying in ContainerCreating or Pending state. For example:
oc describe pod <pod-name>
Status: Pending
SeccompProfile: RuntimeDefault
IP:
IPs: <none>
Controlled By: ReplicaSet/mariadb-7d794d9ccd
Containers:
mariadb:
Image: image-registry.openshift-image-registry.svc:5000/openshift/mariadb@sha256:b11ca823cfb0ef506cd3ff5d0d76eea6b23d61ab254e00bf0cc9dea3e0954795
Port: 3306/TCP
Host Port: 0/TCP
Environment:
MYSQL_ROOT_PASSWORD: pass
Mounts:
/var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-m4675 (ro)
Conditions:
Type Status
PodScheduled False
Volumes:
kube-api-access-m4675:
Type: Projected (a volume that contains injected data from multiple sources)
TokenExpirationSeconds: 3607
ConfigMapName: kube-root-ca.crt
ConfigMapOptional: <nil>
DownwardAPI: true
ConfigMapName: openshift-service-ca.crt
ConfigMapOptional: <nil>
QoS Class: BestEffort
Node-Selectors: <none>
Tolerations: node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning FailedScheduling 9m59s default-scheduler 0/6 nodes are available: 1 Too many pods, 2 node(s) had untolerated taint {node.kubernetes.io/disk-pressure: }, 3 node(s) had untolerated taint {node-role.kubernetes.io/master: }. preemption: 0/6 nodes are available: 1 No preemption victims found for incoming pod, 5 Preemption is not helpful for scheduling.
Warning FailedScheduling 72s (x2 over 6m13s) default-scheduler 0/6 nodes are available: 1 node(s) had untolerated taint {node.kubernetes.io/unreachable: }, 2 node(s) had untolerated taint {node.kubernetes.io/disk-pressure: }, 3 node(s) had untolerated taint {node-role.kubernetes.io/master: }. preemption: 0/6 nodes are available: 6 Preemption is not helpful for scheduling.
- Node Disk Pressure in Logs: Check OpenShift logs for disk pressure events indicating low /sysroot space:
oc describe node <node-name>
Sample command output;
...
Taints: node.kubernetes.io/unreachable:NoExecute
node.kubernetes.io/unreachable:NoSchedule
Unschedulable: false
Lease:
HolderIdentity: k8s-wk-03.ocp.kifarunix-demo.com
AcquireTime:
RenewTime: Wed, 14 May 2025 18:24:44 +0000
Conditions:
Type Status LastHeartbeatTime LastTransitionTime Reason Message
---- ------ ----------------- ------------------ ------ -------
MemoryPressure Unknown Wed, 14 May 2025 18:24:57 +0000 Wed, 14 May 2025 18:25:44 +0000 NodeStatusUnknown Kubelet stopped posting node status.
DiskPressure Unknown Wed, 14 May 2025 18:24:57 +0000 Wed, 14 May 2025 18:25:44 +0000 NodeStatusUnknown Kubelet stopped posting node status.
PIDPressure Unknown Wed, 14 May 2025 18:24:57 +0000 Wed, 14 May 2025 18:25:44 +0000 NodeStatusUnknown Kubelet stopped posting node status.
Ready Unknown Wed, 14 May 2025 18:24:57 +0000 Wed, 14 May 2025 18:25:44 +0000 NodeStatusUnknown Kubelet stopped posting node status.
Addresses:
InternalIP: 192.168.122.215
Hostname: k8s-wk-03.ocp.kifarunix-demo.com
...
Allocated resources:
(Total limits may be over 100 percent, i.e., overcommitted.)
Resource Requests Limits
-------- -------- ------
cpu 745m (21%) 210m (6%)
memory 3649Mi (33%) 8Gi (75%)
ephemeral-storage 0 (0%) 0 (0%)
hugepages-1Gi 0 (0%) 0 (0%)
hugepages-2Mi 0 (0%) 0 (0%)
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning EvictionThresholdMet 50m kubelet Attempting to reclaim ephemeral-storage
Normal NodeHasDiskPressure 50m kubelet Node k8s-wk-03.ocp.kifarunix-demo.com status is now: NodeHasDiskPressure
Normal NodeNotReady 48m node-controller Node k8s-wk-03.ocp.kifarunix-demo.com status is now: NodeNotReady
- Debugging Issues: Inability to debug nodes due to a full /sysroot. Commands like oc debug node/<node-name> fail, reporting insufficient space.
- Node Info Metrics: Use OpenShift’s monitoring tools (e.g., Prometheus) to track disk availability. You can run the same query in the OpenShift web console under Observe > Metrics:
node_filesystem_avail_bytes{mountpoint="/sysroot"}
Proactively monitor these signs to prevent cluster disruptions. Configure alerts in OpenShift’s monitoring stack for low disk space thresholds on /sysroot.
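As a sketch, such an alert could look like the rule below. The alert name, namespace, and 10% threshold are illustrative assumptions, and where you are allowed to create rules depends on how your monitoring stack is configured:
oc apply -f - <<'EOF'
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: sysroot-low-space
  namespace: openshift-monitoring
spec:
  groups:
  - name: node-storage
    rules:
    - alert: SysrootLowDiskSpace
      expr: node_filesystem_avail_bytes{mountpoint="/sysroot"} / node_filesystem_size_bytes{mountpoint="/sysroot"} < 0.10
      for: 15m
      labels:
        severity: warning
      annotations:
        summary: "/sysroot on {{ $labels.instance }} has less than 10% free space"
EOF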
Prerequisites
Before proceeding in a test environment, ensure you have:
- Administrative access to the KVM hypervisor and OpenShift cluster.
- A backup of the VM’s disk image to prevent data loss (a must).
- Familiarity with Linux storage concepts (LVM, partitions, filesystems).
- Tools like virsh and qemu-img installed on the KVM host (growpart and xfs_growfs, used inside the nodes, ship with RHCOS).
- The oc command-line tool for OpenShift management.
Warning: Test this in a non-production environment first. Production changes require extensive validation.
Important: Kindly note that OpenShift Container Platform worker nodes MUST have the same storage type and size attached to each node. As such, if you extend the disk for one node, you must expand it for the rest of the worker nodes as well!
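As a quick sketch, you can confirm that every worker node currently sees the same root disk size before you start (this assumes the root disk is vda, as in this guide):
for node in $(oc get nodes -l node-role.kubernetes.io/worker -o name); do
  echo "== $node"
  oc debug "$node" -- chroot /host lsblk -dn -o NAME,SIZE /dev/vda
done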
Step-by-Step Guide to Extend /sysroot Filesystem
1. Do We Need to Cordon/Drain the Affected Node?
In this guide, we will perform an online KVM disk expansion using virsh blockresize at the hypervisor level and growpart on the node itself to extend the drive. As such, cordoning/draining is not strictly required when performing a disk resize using these tools on a KVM-based node.
- The virsh blockresize command resizes the virtual disk at the hypervisor level without interrupting disk I/O.
- Inside the VM, growpart and filesystem resizing (resize2fs or xfs_growfs) are also safe to run on mounted filesystems, including the root filesystem.
- These operations are non-disruptive and do not require unmounting or rebooting the node.
That said, as a precaution:
- Perform the operation during a maintenance window or low activity period
- Ensure recent backups or snapshots are available
- Monitor disk I/O closely if the node runs critical workloads
2. Back Up the Virtual Disk
Create a clone, snapshot or backup of the VM’s disk image (e.g., QCOW2) if you have enough space on the hypervisor host.
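For example, a minimal sketch on the KVM host; the disk image path and domain name here are assumptions from this demo environment, so adjust them to yours:
# Option 1: offline copy (safest; brief downtime while the VM is shut down)
sudo virsh shutdown ocp-node-wk-01
sudo cp /var/lib/libvirt/images/ocp-node-wk-01.qcow2 /var/lib/libvirt/images/ocp-node-wk-01.qcow2.bak
sudo virsh start ocp-node-wk-01
# Option 2: live external snapshot (no downtime; merge it back with virsh blockcommit before resizing)
sudo virsh snapshot-create-as ocp-node-wk-01 pre-resize --disk-only --atomic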
3. Resize the Virtual Disk Live
We are doing an online disk resize, which does not need the VM to be shut down.
For KVM users, this has been extensively illustrated in our previous guide; these are the logical steps you need to take (see the condensed sketch after the list):
- Step 1. Identify the Running VM
- Step 2. Identify the Target Disk Device
- Step 3. Resize the Virtual Disk Online
- Step 4. Verify the Resize Operation
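Condensed into commands, those steps might look like this (a sketch; the domain name, disk target, and new size are from this demo environment, so adjust them to yours):
sudo virsh list --state-running                  # 1. identify the running VM
sudo virsh domblklist ocp-node-wk-01             # 2. identify the target disk (e.g., vda)
sudo virsh blockresize ocp-node-wk-01 vda 100G   # 3. grow the virtual disk online
sudo virsh domblkinfo ocp-node-wk-01 vda --human # 4. verify the resize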
For example, this was the size of one of my worker nodes before expanding on the hypervisor host.
sudo virsh domblkinfo ocp-node-wk-01 vda
Sample command output
Capacity: 53687091200
Allocation: 53689724928
Physical: 53689450496
Note that all three worker nodes had the same storage.
After extending the VM disk size, this is our new size (the same for all three worker nodes);
sudo virsh domblkinfo ocp-node-wk-03 vda --human
Capacity: 100.000 GiB
Allocation: 49.986 GiB
Physical: 49.985 GiB
Verify the new size inside the VM (Replace ssh-key and worker-node with your SSH key and respective worker node address):
ssh -i ssh-key core@worker-node lsblk
Sample command output;
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINTS
vda 252:0 0 100G 0 disk
├─vda1 252:1 0 1M 0 part
├─vda2 252:2 0 127M 0 part
├─vda3 252:3 0 384M 0 part /boot
└─vda4 252:4 0 49.5G 0 part /var/opt/pwx/oci
/var
/sysroot/ostree/deploy/rhcos/var
/usr
/etc
/
/sysroot
As you can see, the drive is now at 100G, yet the root partition still shows the old size from before the expansion.
Here is the current state:
df -hT
Sample command output;
[core@k8s-wk-03 ~]$ df -hT
Filesystem Type Size Used Avail Use% Mounted on
devtmpfs devtmpfs 4.0M 0 4.0M 0% /dev
tmpfs tmpfs 5.9G 92K 5.9G 1% /dev/shm
tmpfs tmpfs 2.4G 74M 2.3G 4% /run
/dev/vda4 xfs 50G 31G 19G 62% /sysroot
tmpfs tmpfs 5.9G 4.0K 5.9G 1% /tmp
/dev/vda3 ext4 350M 112M 216M 35% /boot
tmpfs tmpfs 64M 0 64M 0% /var/lib/osd/lttng
tmpfs tmpfs 1.2G 0 1.2G 0% /run/user/1000
So, proceed to the final step!
4. Expand the /sysroot Filesystem Inside the VM
So, log in to each VM and expand the /sysroot filesystem.
As you can see from the output of the df -hT command, /sysroot is on the fourth partition of drive vda, /dev/vda4. Hence, use the growpart command to extend the partition so that it takes up all the available space. Replace your drive and partition number accordingly.
sudo growpart /dev/vda 4
Sample command output;
CHANGED: partition=4 start=1050624 old: size=103806943 end=104857566 new: size=208664543 end=209715166
Do the same on all the respective nodes.
Check the partition to verify the resize;
lsblk
Sample output;
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINTS
vda 252:0 0 100G 0 disk
├─vda1 252:1 0 1M 0 part
├─vda2 252:2 0 127M 0 part
├─vda3 252:3 0 384M 0 part /boot
└─vda4 252:4 0 99.5G 0 part /var/lib/kubelet/pods/bc3243dd-6ece-467f-8517-bd3e7dbe8c86/volume-subpaths/nginx-conf/monitoring-plugin/1
/var/lib/kubelet/pods/e5eb4a59-6f8e-4b59-b216-422378e7d91f/volume-subpaths/nginx-conf/networking-console-plugin/1
/var/opt/pwx/oci
/var
/sysroot/ostree/deploy/rhcos/var
/usr
/etc
/
/sysroot
Now proceed to expand the filesystem. Note that the /sysroot partition is an XFS filesystem, as shown in the df -hT output above.
To extend an XFS filesystem, run the xfs_growfs command against the mount point of the respective partition.
But here comes another issue: RHCOS mounts the /sysroot filesystem in read-only mode by default, which means write operations like filesystem expansion will not work unless we first gain write access to the drive.
If you are logged into the node, you can confirm this by running the command below.
mount | grep "on /sysroot "
Sample command output;
/dev/vda4 on /sysroot type xfs (ro,relatime,seclabel,attr2,inode64,logbufs=8,logbsize=32k,prjquota)
As you can see, the ro (read-only) option is shown as one of the mount options.
So, how do we proceed?
Well, you have to remount /sysroot in read-write mode to be able to expand it. To do that safely, you need to create a separate mount namespace, isolated from the rest of the system, that lets you make temporary mount changes (like remounting /sysroot) without affecting the global system. This can be achieved using the unshare command.
For example, run the unshare command with the --mount option to create a separate mount namespace.
sudo unshare --mount
Once you are in the new mount namespace, remount the /sysroot in RW mode.
mount -o remount,rw /sysroot
You should now be able to expand the /sysroot filesystem by running the command below;
xfs_growfs /sysroot
Sample command output;
meta-data=/dev/vda4 isize=512 agcount=68, agsize=191744 blks
= sectsz=512 attr=2, projid32bit=1
= crc=1 finobt=1, sparse=1, rmapbt=0
= reflink=1 bigtime=1 inobtcount=1 nrext64=0
data = bsize=4096 blocks=12975867, imaxpct=25
= sunit=0 swidth=0 blks
naming =version 2 bsize=4096 ascii-ci=0, ftype=1
log =internal log bsize=4096 blocks=16384, version=2
= sectsz=512 sunit=0 blks, lazy-count=1
realtime =none extsz=4096 blocks=0, rtextents=0
data blocks changed from 12975867 to 26083067
Once the resize is complete, exit the mount namespace:
exit
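If you are repeating this across several worker nodes, the remount and grow can also be condensed into a single non-interactive command (a sketch; test it on one node first):
sudo unshare --mount sh -c 'mount -o remount,rw /sysroot && xfs_growfs /sysroot'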
Then verify with the df -hT command;
df -hT
Sample output;
Filesystem Type Size Used Avail Use% Mounted on
devtmpfs devtmpfs 4.0M 0 4.0M 0% /dev
tmpfs tmpfs 5.9G 92K 5.9G 1% /dev/shm
tmpfs tmpfs 2.4G 74M 2.3G 4% /run
/dev/vda4 xfs 100G 31G 70G 31% /sysroot
tmpfs tmpfs 5.9G 4.0K 5.9G 1% /tmp
/dev/vda3 ext4 350M 112M 216M 35% /boot
tmpfs tmpfs 64M 0 64M 0% /var/lib/osd/lttng
tmpfs tmpfs 1.2G 0 1.2G 0% /run/user/1000
As you can see, the /sysroot is expanded successfully.
Confirm the node is ready:
oc get nodes
Sample nodes output;
NAME STATUS ROLES AGE VERSION
k8s-ms-01.ocp.kifarunix-demo.com Ready control-plane,master 87d v1.30.7
k8s-ms-02.ocp.kifarunix-demo.com Ready control-plane,master 87d v1.30.7
k8s-ms-03.ocp.kifarunix-demo.com Ready control-plane,master 87d v1.30.7
k8s-wk-01.ocp.kifarunix-demo.com Ready worker 87d v1.30.7
k8s-wk-02.ocp.kifarunix-demo.com Ready worker 87d v1.30.7
k8s-wk-03.ocp.kifarunix-demo.com Ready worker 87d v1.30.7
Confirm that the nodes no longer have disk pressure:
oc describe node <node-name>
Sample output;
...
Conditions:
Type Status LastHeartbeatTime LastTransitionTime Reason Message
---- ------ ----------------- ------------------ ------ -------
MemoryPressure False Wed, 14 May 2025 20:54:38 +0000 Thu, 08 May 2025 14:15:52 +0000 KubeletHasSufficientMemory kubelet has sufficient memory available
DiskPressure False Wed, 14 May 2025 20:54:38 +0000 Wed, 14 May 2025 20:52:46 +0000 KubeletHasNoDiskPressure kubelet has no disk pressure
PIDPressure False Wed, 14 May 2025 20:54:38 +0000 Thu, 08 May 2025 14:15:52 +0000 KubeletHasSufficientPID kubelet has sufficient PID available
Ready True Wed, 14 May 2025 20:54:38 +0000 Thu, 08 May 2025 14:17:27 +0000 KubeletReady kubelet is posting ready status
Addresses:
InternalIP: 192.168.122.214
Hostname: k8s-wk-02.ocp.kifarunix-demo.com
...
And you have successfully expanded the Red Hat OpenShift CoreOS /sysroot filesystem.
Alternatives to Disk Expansion
What if there were other methods to recover disk space on Red Hat OpenShift CoreOS nodes? Well, before extending /sysroot, consider housekeeping to reclaim space:
- Evict Unused Pods: Identify and delete idle, terminated, or completed pods that were used for one-off jobs. Note that oc get pods -A prints the namespace in the first column and the pod name in the second, so delete each pod within its own namespace:
oc get pods -A --no-headers | grep -E "Terminat|Completed|Error|Crash|StatusUnknown" | while read -r ns pod rest; do oc delete pod "$pod" -n "$ns"; done
- Clean Up Orphaned or Old Images: Remove unused container images. You can manually prune the images from the node or utilize the Cluster Image Registry Operator for automatic pruning.
For manual image pruning, debug into the node;
oc debug node/<node-name> -- chroot /host crictl rmi --prune
You can also use the oc adm prune images command to prune images. You will need a user token to use the command, and the user whose token you use must have the system:image-pruner cluster role or greater.
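For example, a sketch of a manual prune run (the retention options shown are illustrative; depending on your registry route you may also need --certificate-authority or --force-insecure):
oc adm prune images --token="$(oc whoami -t)" --keep-tag-revisions=3 --keep-younger-than=60m --confirm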
OpenShift also ships with an ImagePruner resource called cluster that helps with automatic image pruning.
oc get imagepruner
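As a sketch, you can tune the automatic pruner by patching this resource (the schedule and retention values below are illustrative):
oc patch imagepruner/cluster --type merge -p '{"spec":{"schedule":"0 0 * * *","keepTagRevisions":3,"suspend":false}}'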
Read more on basic OpenShift pruning operations.
- Clear Logs: Truncate or rotate large log files. For example, to truncate all log files larger than 100M in size;
oc debug node/<node-name> -- chroot /host find /var/log -size +100M -exec truncate -s 0 {} \;
And many other housekeeping tasks.
These steps can delay or avoid disk expansion, especially in test environments. Monitor their impact to ensure sufficient space is reclaimed.
Conclusion
Extending the OpenShift CoreOS /sysroot filesystem with minimal downtime is achievable using live resizing tools like virsh blockresize in KVM. Similarly, housekeeping alternatives like evicting unused pods and cleaning old images can further optimize space. For production, validate all the steps thoroughly. This ensures reliable containerized workloads in your cluster.