Persistent storage has long been challenging for stateful applications running in containerized environments like OpenShift. While containers offer flexibility and scalability, their ephemeral nature can lead to data loss if they are not properly managed.
Unlike traditional systems, container storage is temporary by default, meaning that data created within a container can be lost when the container stops. Many applications, especially those that need to maintain data across restarts (like databases), require persistent storage. However, the storage landscape for containerized workloads has evolved significantly, and OpenShift now offers a variety of solutions to address this critical requirement.
In this article, we discuss the key concepts of OpenShift block storage and learn the steps of setting up OpenShift Data Foundation (ODF) services in an OpenShift cluster.
Summary of key concepts of block storage for OpenShift
The following table provides an overview of the key concepts covered in this article.
| Concept | Description |
| --- | --- |
| Block storage in OpenShift | Provides durable storage volumes that preserve application data, even if containers are restarted or relocated. |
| Persistent storage in OpenShift | OpenShift provides persistent storage capabilities to handle data that needs to survive even if an application pod is restarted or replaced. |
| Dynamic volume provisioning | When more storage is needed, OpenShift leverages Container Storage Interface (CSI) drivers to automatically provision storage from various providers for simplified storage management and scalability. |
| OpenShift Data Foundation | A software-defined storage solution specifically designed for OpenShift, offering advanced features like high availability and scalability for containerized environments. |
Block storage in OpenShift
Block storage provides access to raw block devices for application storage. These block devices function as independent storage volumes, similar to the physical drives found in servers, and typically require formatting and mounting for application access. Each block device is treated as an independent disk drive and can support an individual file system.
Block storage is ideal when applications require fast, low-latency access for computationally heavy data workloads. Block-level access to storage volumes is a common approach for databases, server-side processing, and high-performance data access applications. Use block storage if the containerized workload requires fast and reliable data access.
OpenShift can use locally attached drives or volumes provisioned from SAN arrays for block storage. OpenShift does not interact with the physical storage directly; it relies on an abstraction layer to manage and provision storage, decoupling applications from the underlying storage infrastructure.
OpenShift uses block storage for both persistent and temporary storage needs. For persistent storage, OpenShift can utilize underlying storage solutions to provision volumes that persist beyond the lifecycle of pods. These volumes are ideal for stateful applications that require data durability.
For applications that do not require data to persist, OpenShift provides temporary block storage directly from the local storage of the nodes hosting the pods. This ephemeral storage is suitable for temporary data that does not need to be preserved after the pod terminates.
Understanding ephemeral and persistent block storage in OpenShift
Storage in the OpenShift platform can be broadly classified into two categories: ephemeral and persistent. Ephemeral storage is transient in nature and designed for stateless applications. Stateful applications require persistent storage to persist their data independently of the pod’s lifecycle.
Understanding ephemeral storage
Some applications require storage but don’t need the data to persist after they stop; ephemeral volumes are well suited for these scenarios. They are created and deleted with the pod, so pods can be restarted anywhere without relying on specific persistent storage. Containerized workloads use this storage for temporary files, caching, and logs. However, this approach can lead to several issues:
- Unknown storage capacity: Pods can’t determine how much temporary storage is available.
- No guaranteed storage: Pods can’t request a specific amount of temporary storage; it is allocated on a first-come, first-served basis.
- Eviction risk: Pods might be removed if they use too much temporary storage, and new pods can’t start until space is freed up.
- Lack of suitability for stateful applications: Stateful workloads like databases that require data persistence across restarts cannot use ephemeral storage.
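Despite these limitations, ephemeral volumes remain a good fit for scratch data. As a minimal sketch, the pod below mounts an emptyDir volume that is created when the pod is scheduled and deleted when the pod terminates; the pod name, image, and size limit are illustrative.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: cache-demo            # illustrative name
spec:
  containers:
  - name: app
    image: registry.access.redhat.com/ubi9/ubi-minimal   # any image works here
    command: ["sleep", "infinity"]
    volumeMounts:
    - name: scratch
      mountPath: /tmp/cache   # temporary files, caches, or logs live here
  volumes:
  - name: scratch
    emptyDir:
      sizeLimit: 1Gi          # optional cap; exceeding it can trigger eviction
```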
Understanding persistent storage
OpenShift makes storage management and consumption easy for cluster administrators and stateful containerized workloads. It uses the persistent volume framework to allocate storage resources. The PersistentVolume API simplifies persistent storage management by hiding complex details, allowing users and administrators to interact with storage without worrying about the underlying implementation.
There are two relevant resources here:
- Persistent volume claims (PVCs): The workloads in OpenShift clusters express their storage requirements through PVCs. A PVC specifies the storage size and access mode needed and optionally requests a specific storage class.
- Persistent volumes (PVs): Persistent volumes represent the physical storage resources available in the cluster. PVs can utilize block storage protocols (such as Fiber Channel and iSCSI), file storage protocols (such as NFS), or specific storage systems offered by storage array vendors and cloud providers.
Administrators set up storage resources by creating PVs, while developers request those resources for their workloads via PVCs. This enables developers to focus on their applications, not storage details. OpenShift’s approach ensures efficient storage utilization and flexible allocation within the cluster.
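To make the administrator/developer split concrete, here is a rough sketch of a statically provisioned PV backed by an iSCSI LUN and a PVC that claims it. The target portal, IQN, names, and sizes are placeholders; an NFS- or CSI-backed PV follows the same pattern.

```yaml
# Administrator side: a PV describing an existing iSCSI block device
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv-demo
spec:
  capacity:
    storage: 10Gi
  accessModes:
  - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  iscsi:
    targetPortal: 192.168.1.10:3260                  # placeholder portal
    iqn: iqn.2024-01.com.example:storage.target01    # placeholder IQN
    lun: 0
    fsType: ext4
---
# Developer side: a PVC requesting storage that matches the PV above
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: pvc-demo
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
```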
Dynamic volume provisioning with storage classes and CSI drivers
The manual provisioning of persistent volumes and persistent volume claims can be a tedious and error-prone process. It requires precise matching of storage resources to application needs, which can be challenging to predict in advance. Dynamic provisioning can mitigate some of these issues by automating the provisioning of persistent volumes.
Storage classes enable dynamic provisioning of storage resources based on applications’ needs. They allow administrators to describe the classes of storage they offer by specifying factors like provisioners, access modes, and quality of service. This allows for more granular control over the allocation of persistent volumes and ensures that workloads receive the appropriate storage resources for their specific needs. When creating a PVC, workloads can specify a desired storage class based on their requirements.
Storage classes provide a versatile way to define and work with various types of storage in your environment. However, they can be limited when working with enterprise storage arrays. The Container Storage Interface (CSI) helps overcome these challenges by establishing a standard interface that allows different storage systems to connect with OpenShift. The CSI interface decouples the storage systems from OpenShift, which makes it easier to integrate storage from a wide range of storage providers, such as traditional storage arrays, cloud platforms, and object storage. This design allows cluster administrators to select the most suitable storage options for their containerized workloads without facing any limitations imposed by default storage plugins.
OpenShift can use the Container Storage Interface to consume storage from storage backends that implement the CSI interface as persistent storage. CSI acts as a plugin between OpenShift and the underlying storage provider. It translates storage requests (PVs and PVCs) into specific calls for the storage array that the driver manages.
OpenShift can use any supported CSI provisioner. Each storage class specifies a provisioner, which determines the volume plugin used to provision volumes and converts PVC requests into CSI calls for creating PVs.
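For illustration, here is roughly what a storage class definition looks like, assuming the cluster runs on AWS with the EBS CSI driver installed; the class name and parameters are illustrative, and each provisioner accepts its own set of parameters.

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: fast-block                       # illustrative name
provisioner: ebs.csi.aws.com             # assumes the AWS EBS CSI driver is available
parameters:
  type: gp3                              # provisioner-specific parameter
reclaimPolicy: Delete
volumeBindingMode: WaitForFirstConsumer  # delay binding until a pod is scheduled
allowVolumeExpansion: true
```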
The following figure illustrates this process in detail.
Visualizing the PV, PVC, storage class, and CSI driver workflow in OpenShift
More details about persistent storage implementation in OpenShift can be found in its official documentation.
Simplifying block storage with OpenShift Data Foundation (ODF)
OpenShift Data Foundation is a storage solution from Red Hat that simplifies persistent storage management for containerized workloads deployed in OpenShift. It offers a unified approach to file, block, and object storage for both on-premises and hybrid cloud environments. Unlike conventional storage systems that require separate drivers and operators for different storage types, ODF provides a consolidated platform that meets all persistent storage needs for the cluster.
OpenShift Data Foundation architecture
Under the hood, ODF is based on open-source technologies such as Ceph, NooBaa, and Rook:
- Ceph: Ceph is a unified, distributed, and scalable software storage solution that can provide object, block, and file storage for commodity hardware.
- Rook: Rook is an open-source orchestration tool for cloud-native Kubernetes storage. It provides the necessary framework for integrating Ceph storage within Kubernetes and OpenShift.
- NooBaa: ODF uses the Multicloud Object Gateway (MCG) service based on the NooBaa project to provide a local object service (S3 API) backed by local or cloud-native storage.
OpenShift Data Foundation (ODF) uses Ceph to provide highly available and scalable block storage. Ceph pools the underlying physical storage devices into a virtualized storage layer that guarantees high availability via data replication. ODF abstracts the underlying storage details so that file, block, and object storage claims can all be provisioned from the same raw block storage. With ODF, data durability and fault tolerance are ensured by Ceph’s self-healing and replication underneath.
ODF provides the following types of storage:
- Block storage: ODF uses Ceph’s RADOS Block Device (RBD) to create block storage volumes that can be used for high-performance and demanding workloads.
- File storage: ODF utilizes CephFS, a distributed file system built on top of Ceph, to provide scalable and shared file storage as an alternative to NFS.
- Object storage: ODF leverages Ceph’s RADOS Gateway (RGW) and NooBaa to provide object storage. This can be used to store and retrieve large amounts of unstructured data, such as media files and backups.
Simplified architecture of OpenShift Data Foundation (source)
The Rook operator creates and updates the CSI driver, including a provisioner for each of the two drivers—RADOS block device (RBD) and Ceph filesystem (CephFS)—and volume plugin daemons for each of the two drivers.
Deployment options
As shown in the figure above, ODF provides deployment flexibility so that end-users can adopt the most appropriate approach for their environment. OpenShift Data Foundation can be deployed in the following two modes:
- Internal Mode: ODF is deployed entirely within the OpenShift cluster. This approach can use local storage devices, SAN volumes, EBS volumes, or vSphere volumes in combination with the Local Storage Operator (LSO). This approach is practical when:
- Cluster storage requirements are not clearly defined.
- There are no dedicated infrastructure nodes.
- Creating an extra node instance, such as on bare metal servers, is difficult.
- External Mode: In external deployments, ODF uses an independent Ceph Storage cluster running outside the OpenShift cluster. This approach is recommended when:
- The cluster’s storage requirements are significant.
- Multiple OpenShift clusters are consuming storage services from a standard external cluster.
- Another dedicated team is managing the external Ceph cluster.
Setting up OpenShift Data Foundation
We’re now going to see how to set up ODF services in an OpenShift cluster. For this demo, we configure ODF in internal mode.
The commands used in the following tutorial can also be found in our Git repo.
Prerequisites
The minimum requirements for setting up ODF services on OpenShift are as follows:
- You need an OpenShift cluster with at least three worker or infrastructure nodes.
- Each of the selected nodes must have at least one raw block device available for use by ODF.
- The block devices must be empty and must not contain any LVM-related configurations.
- If the selected OpenShift nodes are VMs on VMware, ensure that the disk.EnableUUID option is set to TRUE for each VM.
- For internal mode, the cluster should have a minimum of:
- 72 GB RAM
- 24 CPU cores
- 3 physical disks
Operator installation
To install the ODF operator on your OpenShift cluster, log into the web console with an account having cluster-admin privileges and install two operators as follows:
- Click Operators > OperatorHub, then type OpenShift Data Foundation in the Filter by keyword field. Click OpenShift Data Foundation from the operator results list, then click Install. Accept all the default settings from the Operator Installation page, then click Install.
- Click Operators > OperatorHub, then type Local Storage in the Filter by keyword field. Click Local Storage from the operator results list, then click Install. Accept all the default settings from the Operator Installation page, then click Install.
Preparing local storage for ODF
Before using local storage disks for ODF deployment, the following tasks need to be performed using the local storage operator:
- Discovery of available disks that will be used for the ODF cluster
- Creation of storage class and persistent volumes
Go to Installed Operators > Local Storage > Local Volume Discoveries and create a local volume discovery operation for the desired nodes.
Configuring the local storage operator
Discovering local volumes on cluster nodes
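If you prefer the CLI to the console form shown above, a LocalVolumeDiscovery custom resource along these lines should be roughly equivalent; the resource name is illustrative, the namespace assumes the Local Storage Operator's default project, and the node names match the nodes used in this demo.

```yaml
apiVersion: local.storage.openshift.io/v1alpha1
kind: LocalVolumeDiscovery
metadata:
  name: auto-discover-devices          # illustrative name
  namespace: openshift-local-storage   # assumes the operator's default namespace
spec:
  nodeSelector:
    nodeSelectorTerms:
    - matchExpressions:
      - key: kubernetes.io/hostname
        operator: In
        values:                        # nodes selected for disk discovery
        - ocpnode-1
        - ocpnode-2
        - ocpnode-3
```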
The discovery operation runs on each selected node. To see more details about its results, use the oc describe command to view the list of disks and their availability status; disks already used by the cluster will show a status of NotAvailable. Identify the disk you want to add to the ODF cluster and note its Device ID.
You could also use the /dev/sdX identifiers for your disks, but that can lead to configuration issues: /dev/sdX names are not stable identifiers because they are assigned in the order in which drives are discovered, and that order can change across reboots. It is best to use the unique identifiers listed under Device ID, since they persist across reboots.
```
$ oc get localvolumediscoveryresults
NAME                         AGE
discovery-result-ocpnode-1   1h
discovery-result-ocpnode-2   1h
discovery-result-ocpnode-3   1h

$ oc describe localvolumediscoveryresults discovery-result-ocpnode-1 | less
[...............]
    Device ID:  /dev/disk/by-id/wwn-0x6000c29c01d91ed7b7109f82c40d42e7
    Fstype:
    Model:      Virtual disk
    Path:       /dev/sdc
    Property:   Rotational
    Serial:     6000c29c01d91ed7b7109f82c40d42e7
    Size:       107374182400
    Status:
      State:    Available
[.................]
```
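As a quick cross-check from the CLI, you can list the stable by-id paths directly on a node; the node name below is illustrative, and this assumes you have debug access to the node.

```
$ oc debug node/ocpnode-1 -- chroot /host ls -l /dev/disk/by-id/
```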
Repeat this process for other discovery results and verify the details of the available disks.
Once device details have been identified, return to the web console and go to Installed Operators > Local Storage > Local Volume and click Create Local Volume.
Creating local volumes on cluster nodes
Choose a name for the local volumes and click the drop-down menu under StorageClassDevices > devicePaths. In the Value field, paste your drives’ Device ID details. Click Add devicePaths to add details about additional drives.
Adding device details for local volume creation
Set a name for your storage class under storageClassName and set the volumeMode to Block. Leave the other options unchanged and create the local volume.
Configuring parameters for local volume creation
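For reference, the console form above corresponds roughly to a LocalVolume custom resource like the one below. The resource name is illustrative, the namespace assumes the Local Storage Operator's default project, and the device path is the Device ID noted from the discovery results earlier; add one path per disk and node.

```yaml
apiVersion: local.storage.openshift.io/v1
kind: LocalVolume
metadata:
  name: local-block                    # illustrative name
  namespace: openshift-local-storage   # assumes the operator's default namespace
spec:
  nodeSelector:
    nodeSelectorTerms:
    - matchExpressions:
      - key: kubernetes.io/hostname
        operator: In
        values:
        - ocpnode-1
        - ocpnode-2
        - ocpnode-3
  storageClassDevices:
  - storageClassName: lso              # storage class created for these volumes
    volumeMode: Block
    devicePaths:                       # stable by-id paths noted during discovery
    - /dev/disk/by-id/wwn-0x6000c29c01d91ed7b7109f82c40d42e7
```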
Once the local volume has been created successfully, you can verify that the new storage class has been created and a new persistent volume exists against each physical device.
```
$ oc get sc
NAME   PROVISIONER                    RECLAIMPOLICY   VOLUMEBINDINGMODE      ALLOWVOLUMEEXPANSION   AGE
lso    kubernetes.io/no-provisioner   Delete          WaitForFirstConsumer   false                  1h

$ oc get pv
NAME                CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS   CLAIM                                               STORAGECLASS   REASON   AGE
local-pv-3cfceaf    20Gi       RWO            Delete           Bound    openshift-storage/ocs-deviceset-lso-0-data-4fgmfr   lso                     1h
local-pv-9ccedf9f   100Gi      RWO            Delete           Bound    openshift-storage/ocs-deviceset-lso-0-data-5zz5lk   lso                     1h
local-pv-a862d6ec   100Gi      RWO            Delete           Bound    openshift-storage/ocs-deviceset-lso-0-data-15bsz7   lso                     1h
local-pv-b98bce49   100Gi      RWO            Delete           Bound    openshift-storage/ocs-deviceset-lso-0-data-0kmr2r   lso                     86d
[.........................]
```
Once local persistent volumes have been successfully created, a storage cluster can be created using the ODF operator. Go to Installed Operators > OpenShift Data Foundation > Storage System and click Create Storage System.
Creating an ODF storage system
Select the Full deployment and Use an existing StorageClass options, and choose the recently created storage class.
Specifying the storage system deployment type
On the Capacity and nodes page, choose a value for Requested Capacity and select the appropriate cluster nodes with the attached devices.
Selecting cluster nodes for ODF installation
You can leave the default settings for Security and network and Data Protection and proceed to create the storage system.
Configuring storage system parameters
Monitor the progress of the storage cluster, pods, and PVCs in the openshift-storage project. It takes a few minutes for all of the resources to be ready.
```
$ watch oc get storagecluster,pods -n openshift-storage
```
Once the storage cluster has been created successfully, the command will report Ready status.
```
$ oc get storagecluster -n openshift-storage
NAME                 AGE   PHASE   EXTERNAL   CREATED AT             VERSION
ocs-storagecluster   1h    Ready              2024-03-11T11:22:03Z   4.14.11
```
List the available storage classes. You’ll see that ODF has created the following storage classes.
```
$ oc get sc
NAME                          PROVISIONER                             RECLAIMPOLICY   VOLUMEBINDINGMODE      ALLOWVOLUMEEXPANSION   AGE
lso                           kubernetes.io/no-provisioner            Delete          WaitForFirstConsumer   false                  1h
ocs-storagecluster-ceph-rbd   openshift-storage.rbd.csi.ceph.com      Delete          Immediate              true                   1h
ocs-storagecluster-ceph-rgw   openshift-storage.ceph.rook.io/bucket   Delete          Immediate              false                  1h
ocs-storagecluster-cephfs     openshift-storage.cephfs.csi.ceph.com   Delete          Immediate              true                   1h
openshift-storage.noobaa.io   openshift-storage.noobaa.io/obc         Delete          Immediate              false                  1h
```
The use cases of these storage classes are as follows:
- ocs-storagecluster-ceph-rbd: This class supports block storage devices primarily used for high-performance workloads like databases.
- ocs-storagecluster-cephfs: This class provides shared and distributed file system data services, primarily used for logging and data aggregation workloads.
- openshift-storage.noobaa.io: This class provides the Multicloud Object Gateway (MCG) service, which provides multicloud object storage as an S3 API endpoint. This allows for the abstracting and retrieving of data from multiple cloud object stores.
- ocs-storagecluster-ceph-rgw: This class provides on-premises object storage, primarily targeting data-intensive applications.
Using ODF block storage
ODF provides the storage class ocs-storagecluster-ceph-rbd for provisioning block volumes.
Let’s see more details about this class.
```
$ oc describe sc ocs-storagecluster-ceph-rbd
Name:                  ocs-storagecluster-ceph-rbd
IsDefaultClass:        No
Annotations:           description=Provides RWO Filesystem volumes, and RWO and RWX Block volumes,storageclass.kubernetes.io/is-default-class=true
Provisioner:           openshift-storage.rbd.csi.ceph.com
Parameters:            clusterID=openshift-storage,csi.storage.k8s.io/controller-expand-secret-name=rook-csi-rbd-provisioner,csi.storage.k8s.io/controller-expand-secret-namespace=openshift-storage,csi.storage.k8s.io/fstype=ext4,csi.storage.k8s.io/node-stage-secret-name=rook-csi-rbd-node,csi.storage.k8s.io/node-stage-secret-namespace=openshift-storage,csi.storage.k8s.io/provisioner-secret-name=rook-csi-rbd-provisioner,csi.storage.k8s.io/provisioner-secret-namespace=openshift-storage,imageFeatures=layering,deep-flatten,exclusive-lock,object-map,fast-diff,imageFormat=2,pool=ocs-storagecluster-cephblockpool
AllowVolumeExpansion:  True
MountOptions:          <none>
ReclaimPolicy:         Delete
VolumeBindingMode:     Immediate
Events:                <none>
```
The output above highlights the following details:
- IsDefaultClass — Indicates whether this storage class is used by default for persistent volume claims (PVCs) that do not specify a storage class.
- Provisioner: openshift-storage.rbd.csi.ceph.com — This indicates that the storage class uses the Ceph RBD CSI driver to provision volumes.
- Parameters: This section lists various parameters used by the CSI driver to configure the storage:
- clusterID=openshift-storage — Identifies the Ceph cluster to use
- pool=ocs-storagecluster-cephblockpool — Specifies the Ceph pool where the volumes will be created
- csi.storage.k8s.io/fstype=ext4 — Defines the filesystem type to be used on the volumes (ext4 in this case)
- Other parameters that configure secrets used for authentication and node access
- AllowVolumeExpansion=True — Enables expanding the size of persistent volumes provisioned by this storage class
- MountOptions: <none> — No specific mount options defined
- ReclaimPolicy: Delete — Specifies that when a PVC using this storage class is deleted, the corresponding persistent volume will also be deleted
- VolumeBindingMode: Immediate — Indicates that persistent volume claims using this storage class will be bound to a persistent volume as soon as they are created
Let’s create a persistent volume claim using the ocs-storagecluster-ceph-rbd storage class and verify the creation of the corresponding persistent volume.
Create a sample PVC in the default project.
```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: block-pvc
  namespace: default
spec:
  accessModes:
  - ReadWriteOnce
  storageClassName: ocs-storagecluster-ceph-rbd
  volumeMode: Filesystem
  resources:
    requests:
      storage: 1Gi
```
Verify that the PVC is created and immediately bound to the PV.
```
$ oc create -f pvc.yaml
persistentvolumeclaim/block-pvc created

$ oc get pvc -n default
NAME        STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS                  AGE
block-pvc   Bound    pvc-41a79ec7-7968-4d92-8f03-09f8b69f2d51   1Gi        RWO            ocs-storagecluster-ceph-rbd   35s

$ oc get pv | grep block
pvc-41a79ec7-7968-4d92-8f03-09f8b69f2d51   1Gi   RWO   Delete   Bound   default/block-pvc   ocs-storagecluster-ceph-rbd   35s
```
Applications can simply reference this PVC in their manifests to consume storage from this PV.
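As a minimal sketch, the pod below mounts the block-pvc claim created above; the pod name, image, and mount path are illustrative.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: block-demo             # illustrative name
  namespace: default
spec:
  containers:
  - name: app
    image: registry.access.redhat.com/ubi9/ubi-minimal   # any image works here
    command: ["sleep", "infinity"]
    volumeMounts:
    - name: data
      mountPath: /data         # the RBD-backed volume is mounted here
  volumes:
  - name: data
    persistentVolumeClaim:
      claimName: block-pvc     # PVC created in the previous step
```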
Recommendations for managing block storage in OpenShift
The following are some of the best practices for managing block storage in OpenShift:
- When creating PVCs, request only the necessary storage capacity to avoid wasting space.
- Choose an appropriate storage solution, such as OpenShift Data Foundation, which supports advanced features such as snapshots and clones.
- Regularly review and delete old PVCs that are no longer needed.
- Define appropriate limits and quotas for your storage to control consumption (see the quota sketch after this list).
- Regularly monitor storage usage and performance to identify potential issues and optimize resource utilization.
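As a sketch of such a quota, the ResourceQuota below caps the number of PVCs and the total requested capacity in a project, including a per-storage-class limit; the quota name, namespace, and numbers are illustrative.

```yaml
apiVersion: v1
kind: ResourceQuota
metadata:
  name: storage-quota              # illustrative name
  namespace: default               # apply per project as needed
spec:
  hard:
    persistentvolumeclaims: "10"   # maximum number of PVCs in the project
    requests.storage: 100Gi        # total storage that all PVCs may request
    # cap requests against the ODF block storage class specifically
    ocs-storagecluster-ceph-rbd.storageclass.storage.k8s.io/requests.storage: 50Gi
```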
Conclusion
Block storage is a significant requirement for running stateful and high-performing workloads. However, implementing persistent block storage for containerized workloads has been challenging, as traditional storage options often struggle to integrate with orchestration platforms like OpenShift. OpenShift Data Foundation is a storage solution from Red Hat that simplifies managing persistent block storage for containerized workloads deployed in OpenShift.