Reference Guide: Deploying Red Hat OpenShift: Options, Trade-offs, and Best Practices

Deploying an OpenShift cluster involves more than just a standard Kubernetes installation. As an opinionated Kubernetes distribution, OpenShift predefines elements like the operating system and an immutable control plane, and it includes additional components for enhanced security and operational ease.

Unlike Kubernetes, which can be installed on almost any general-purpose Linux distribution (such as Ubuntu or Red Hat Enterprise Linux) using package managers, OpenShift has stricter infrastructure requirements. Starting with version 4.19, OpenShift requires Red Hat CoreOS (RHCOS) for its cluster nodes, enforcing an immutable, container-optimized OS for security and consistency. This design not only reduces configuration drift and attack surfaces but also simplifies installation compared to standard Kubernetes, thanks to the OpenShift installer automating the challenging aspects of the setup.

This article explores OpenShift’s deployment process, how it differs from traditional Kubernetes setups, and the trade-offs between self-managed and managed options.

Summary of key OpenShift deployment considerations

The following table provides an overview of the key considerations when deploying OpenShift: 

Consideration | Description
OpenShift vs. Kubernetes | OpenShift provides a curated Kubernetes experience with preintegrated security, networking, and tooling, whereas vanilla Kubernetes serves as a flexible base requiring additional setup for production-grade features.
Self-managed vs. managed OpenShift | Self-managed OpenShift allows deep customization for specialized needs, while managed OpenShift simplifies operations with turnkey clusters, making it ideal for teams prioritizing developer productivity over infrastructure control.
Setting up OpenShift on Azure | Azure Red Hat OpenShift (ARO) simplifies OpenShift deployment on Azure by managing the control plane and infrastructure, enabling developers to focus on applications while maintaining compatibility with standard OpenShift tooling and APIs.
Backing up OpenShift | Trilio provides production-grade OpenShift backup with custom resource definitions and operator support, enabling full-cluster or namespace-level recovery to any point in time.


Key differences between OpenShift and standard Kubernetes deployments

While Kubernetes is the foundation for container orchestration, OpenShift, Red Hat’s enterprise-ready Kubernetes platform, builds upon it with opinionated design choices, enhanced security, and integrated tooling. The following are some of the key distinctions between OpenShift and a standard Kubernetes deployment.

Node requirements and immutable infrastructure

Unlike traditional Kubernetes, where nodes can run on any Linux distribution (e.g., Ubuntu or CentOS), OpenShift mandates Red Hat CoreOS (RHCOS) for control plane nodes. RHCOS is an immutable, container-optimized OS explicitly designed for OpenShift. Its immutability ensures that the control plane remains secure and consistent and that critical files cannot be modified at runtime, reducing attack surfaces.

Worker nodes can use RHCOS or Red Hat Enterprise Linux (RHEL), but enforcing RHCOS for control planes highlights OpenShift’s focus on security by default. In contrast, vanilla Kubernetes can run on all major Linux distributions. While this offers great flexibility, it can lead to inconsistencies and potential vulnerabilities if the underlying OS is not appropriately hardened.

Built-in tooling and integrations

OpenShift ships with several integrated solutions that Kubernetes administrators would typically need to deploy separately:

  • OperatorHub: OpenShift includes OperatorHub by default, simplifying the installation and lifecycle management of Kubernetes Operators (e.g., databases and monitoring stacks). In contrast, vanilla K8s requires the manual setup of Operator frameworks like the Operator Lifecycle Manager (OLM).
  • Authentication: OpenShift provides built-in OAuth (via the OpenShift OAuth server), allowing seamless integration with identity providers (LDAP, GitHub, etc.). Standard Kubernetes relies on third-party solutions (e.g., Dex, Keycloak) or manual configuration.
  • Ingress and load balancing: OpenShift uses HAProxy-based routers as its default, primary ingress controller, whereas Kubernetes requires setting up ingress controllers (e.g., NGINX, Traefik) separately; a short example follows this list.
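As a quick illustration, the following sketch deploys a sample application and publishes it through the default router with a Route; the project name, image, and app name are illustrative placeholders:

# Create a demo project and deploy a sample web server
oc new-project router-demo
oc new-app registry.access.redhat.com/ubi9/httpd-24 --name=myapp
# Expose the service; the default HAProxy router serves the resulting Route automatically
oc expose service myapp
oc get route myapp -o jsonpath='{.spec.host}{"\n"}'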

Security and compliance enforcement

OpenShift enforces stricter security policies than standard Kubernetes: SELinux is enabled by default, providing containers with mandatory access control (MAC). Role-based access control (RBAC) is preconfigured with sensible defaults, whereas Kubernetes leaves RBAC rules to the administrator.

Network policies are more streamlined, with OpenShift’s software-defined networking (SDN) offering multi-tenancy support. In contrast, while Kubernetes supports these features, they often require additional configuration and third-party tools to match OpenShift’s security posture.
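For a concrete sense of these defaults, the following commands inspect the preconfigured security context constraints and relax them for a single workload; this is a minimal sketch, assuming a recent OpenShift 4.x cluster, with the service account and namespace names as placeholders:

# List the security context constraints (SCCs) shipped by default
oc get scc
# restricted-v2 is the SCC most workloads run under: non-root, no privilege escalation
oc describe scc restricted-v2
# Grant a more permissive SCC to one service account instead of loosening cluster defaults
oc adm policy add-scc-to-user anyuid -z my-sa -n my-namespace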

Infrastructure nodes

Since OpenShift is a commercial product, organizations with growing application workloads can quickly face higher subscription costs as their infrastructure expands. As the containerized footprint grows, so does the platform overhead for components like monitoring, logging, and routing. To mitigate this cost and provide a clear separation of duties, OpenShift supports dedicated infrastructure nodes that host only platform components, such as the default router, the integrated container image registry, and the cluster metrics and monitoring stack. Because these machines run no application workloads, they are not counted against the subscriptions required to run the environment, and they keep platform maintenance separate from application management.
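As a sketch of how this is typically configured (the node name is a placeholder; the label and IngressController patch follow Red Hat’s documented conventions), a worker can be labeled as an infrastructure node and the default router steered onto it:

# Label an existing worker as an infrastructure node
oc label node infra-node-1 node-role.kubernetes.io/infra=""
# Move the default router onto infra nodes via the IngressController's node placement
oc patch ingresscontroller default -n openshift-ingress-operator --type=merge -p '{"spec":{"nodePlacement":{"nodeSelector":{"matchLabels":{"node-role.kubernetes.io/infra":""}}}}}'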

OpenShift vs. Kubernetes: Summary of key deployment takeaways

The following table summarizes the deployment experience for both OpenShift and Kubernetes, highlighting key differences in setup, prerequisites, and post-deployment management. 

Feature | OpenShift | Kubernetes
Deployment prerequisites | Requires Red Hat CoreOS (RHCOS) for the control plane (immutable infrastructure); workers can use RHEL | Supports any Linux OS (Ubuntu, CentOS, etc.)
Deployment method | Uses the OpenShift installer (IPI/UPI) with an automated, opinionated setup | Manual or tool-driven (kubeadm, kOps, etc.); more flexibility but higher configuration effort
Resource requirements | Higher minimum resource requirements due to integrated services like the registry, router, and monitoring stack | Lower baseline resource requirements; only core components are included, with optional services added as needed
Authentication | Built-in OAuth server (supports LDAP, GitHub, etc.) | Requires third-party solutions (Dex, Keycloak) or manual configuration
Ingress configuration | Includes an HAProxy-based router out of the box | Needs manual ingress controller setup (NGINX, Traefik, etc.)
Add-on and service management | Integrated OperatorHub (preloaded with certified operators) | Requires manual installation of the Operator Lifecycle Manager (OLM)
Default security posture | Enforces stricter policies by default (e.g., containers run as non-root), integrated OAuth, and security context constraints (SCCs) | Security tooling (SELinux, RBAC, admission policies) must be configured manually
Networking | Uses OpenShift SDN (with multi-tenancy support) or OVN-Kubernetes | Requires plugins like Calico, Flannel, or Cilium to be installed separately
Registry | Built-in integrated container registry (with image signing) | Requires external registry setup (e.g., Harbor, Docker Registry)
Post-deployment upgrades | Automated with the OpenShift Cluster Version Operator (CVO) | Manual or tool-assisted (kubeadm, distro-specific methods)
User interface | Comprehensive web console for both developers and administrators | Primarily command-line driven (kubectl); the web dashboard is basic and requires separate installation
Licensing and support | Commercial product (Red Hat subscription required) with enterprise support | Free and open source (community or vendor-supported distributions available)

OKD: The OpenShift upstream project

OKD, previously known as OpenShift Origin, is the community-driven upstream project that powers Red Hat OpenShift. It packages all the essential components needed to run Kubernetes and optimizes them for continuous application development and deployment.

Unlike OpenShift—a hardened, enterprise-ready product—OKD serves as the innovation hub where the community introduces and tests new features before refining them for enterprise adoption in Red Hat OpenShift. As a result, OKD is generally a few releases ahead of OpenShift, offering early access to cutting-edge capabilities.

OKD is ideal for developers who want to experiment with the latest container orchestration advancements before they reach OpenShift’s stable releases. However, since it lacks Red Hat’s commercial support and security certifications, OKD is best suited for testing, development, and environments where community support suffices.

OKD is where the OpenShift ecosystem evolves, while OpenShift itself delivers a polished, production-grade platform for enterprises.

OpenShift Virtualization Engine

Red Hat also offers another OpenShift variant, the OpenShift Virtualization Engine, which is a specialized edition for running virtual machines. It streamlines VM management by removing unrelated features, giving teams a focused solution for virtualization workloads. 

OpenShift Virtualization leverages the KVM hypervisor, a virtualization module in the Linux kernel that allows the kernel to function as a type-1 hypervisor. KVM is a mature technology that major cloud providers use as the virtualization backend for their infrastructure-as-a-service (IaaS) offerings.

OpenShift Virtualization builds on KVM with KubeVirt, which lets Kubernetes manage virtual machines through its own APIs. As a result, the virtual machines use OpenShift’s scheduling, network, and storage infrastructure.
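In practice, this means VMs are handled with the same tooling as pods. The commands below are a brief sketch, assuming OpenShift Virtualization is installed and the virtctl client is available; the VM and namespace names are placeholders:

oc get vm -n vm-workloads                  # VirtualMachine objects, managed like any Kubernetes resource
virtctl start fedora-vm -n vm-workloads    # power on a VM
virtctl console fedora-vm -n vm-workloads  # attach to its serial console
oc get vmi -n vm-workloads                 # running VMs appear as VirtualMachineInstance objects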

OpenShift installation options: self-managed vs. managed services

OpenShift offers multiple deployment models to fit different operational needs, from complete control to hands-off management. Below, we compare self-managed OpenShift (installed on premises or in the cloud) with managed OpenShift services.

Self-managed OpenShift deployment

The OpenShift Container Platform provides several options when deploying a cluster on any infrastructure. Four primary deployment methods are available, each of which provides a highly available infrastructure; the right choice depends on your specific scenario:

  • Assisted installer: This is the easiest way of deploying a cluster because it offers a web-based and user-friendly interface and is ideal for networks with access to the public Internet. It also offers smart defaults, pre-flight checks, and a REST API for automation. The assisted installer generates a discovery image, which is used to boot the cluster machines. 
  • Agent-based installation: This approach requires setting up a local agent and configuration via the command line; it is better suited to disconnected or restricted networks.
  • Automated installation: This method deploys an installer-provisioned infrastructure using the baseboard management controller on each cluster host. It works in both connected and disconnected environments.
  • Full control installation: This approach is ideal if you want complete control of the underlying infrastructure hosting the cluster nodes. It supports both connected and disconnected environments and provides maximum customization by deploying user-prepared and maintained infrastructure.

The automated installer approach is usually associated with installer-provisioned infrastructure (IPI), while the other methods are usually associated with user-provisioned infrastructure (UPI).
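To make the IPI/UPI distinction concrete, here is a minimal sketch of the openshift-install flow (the asset directory is a placeholder):

# IPI: the installer provisions the infrastructure end to end
openshift-install create install-config --dir=./my-cluster   # interactive prompts write install-config.yaml
openshift-install create cluster --dir=./my-cluster
# UPI: generate manifests and ignition configs, then provision and boot machines yourself
openshift-install create manifests --dir=./my-cluster
openshift-install create ignition-configs --dir=./my-cluster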

Managed OpenShift deployments

If you’d rather not deal with infrastructure headaches, Red Hat’s managed OpenShift options are worth considering. Services like Red Hat OpenShift Service on AWS (ROSA) and Azure Red Hat OpenShift (ARO) handle the heavy lifting: Red Hat and the cloud provider take care of cluster setup, maintenance, and security patches. This allows teams to focus entirely on building and deploying applications.

There are some trade-offs, of course. While managed services offer convenience and built-in monitoring, you’ll have less control than when running your own OpenShift clusters. Pricing depends on your cloud provider, but for many businesses, the reduced operational burden justifies the cost.

These solutions work best for companies that need production-ready Kubernetes without building an in-house platform team. The major cloud platforms providing this service are shown in the table below.

Cloud provider | Managed OpenShift platform | Management | Billing
AWS | Red Hat OpenShift Service on AWS (ROSA) | AWS and Red Hat | Billed via AWS
AWS | Red Hat OpenShift Dedicated (OSD) | Red Hat | Separate Red Hat subscription; AWS infrastructure bill
Azure | Azure Red Hat OpenShift (ARO) | Microsoft and Red Hat | Billed via Azure
GCP | Red Hat OpenShift Dedicated (OSD) | Red Hat | Separate Red Hat subscription; GCP infrastructure bill
IBM | Red Hat OpenShift on IBM Cloud (ROKS) | IBM | Integrated with IBM Cloud services

Comparing self-managed and managed OpenShift deployments

The following table outlines the differences between managed and self-managed OpenShift deployments.

Aspect | Self-managed OpenShift | Managed OpenShift
Provisioning and infrastructure management | Full infrastructure management required | Provider-managed infrastructure
Operations | Manual maintenance and upgrades | Automated maintenance
Security | Manual configuration required | Built-in security controls
Team requirements | Specialized skills needed | Reduced operational expertise
Scalability | Manual scaling processes | Automatic scaling capabilities
Cost structure | Higher upfront capital expenses | Predictable operational expenses
Deployment speed | Longer deployment timelines | Rapid cluster provisioning
Compliance | Self-managed documentation | Provider-maintained compliance

Step-by-Step: Azure Red Hat OpenShift deployment

The following walkthrough covers the essential stages for setting up an Azure Red Hat OpenShift (ARO) deployment.

Prerequisites

The ARO cluster can be created using the Azure CLI or the Azure portal. Using the Azure portal provides a significant advantage for disaster recovery because it lets customers download an installation configuration file that serves as a blueprint for recreating an ARO cluster with precisely the same configuration. This proves extremely helpful when cost-conscious customers choose not to maintain a continuously running disaster recovery cluster: in the event of a disaster, the saved file can be used to rapidly provision a new ARO cluster and immediately deploy Trilio afterward. The process can be further accelerated through automation, pointed at the application backup target, and used to restore workloads in a prioritized sequence.

The following guide focuses on creating a cluster using the Azure CLI, which can be installed by following the instructions here.

The ARO deployment requires a minimum of 44 CPU cores to spin up a new cluster. This typically exceeds default Azure quotas for new subscriptions. If the current limits in your Azure account are too low, you’ll need to submit a quota increase request specifically for VM vCPUs before proceeding.

Here’s how those cores get allocated during installation:

  • Bootstrap node: 8 cores power the temporary bootstrap machine. 
  • Control plane: 24 cores are dedicated to control plane nodes.
  • Worker nodes: 12 cores are for compute workloads.

Once the installation finishes, the bootstrap machine disappears, bringing the core usage down to 36.

By default, the cluster installation creates three control plane nodes and three worker nodes. This is the minimum number of nodes required for the cluster to be supported by Microsoft and Red Hat. Reducing the cluster size to less than this configuration would violate the support agreement. A maximum of 250 worker nodes is supported.

To access the cluster post installation, make sure to download the desired OpenShift CLI (oc) version from here.

By default, the ARO installation uses the Standard_D8s_v5 virtual machine size for control plane nodes and Standard_D4s_v5 for worker nodes. You can use the following command to check the available cores for the Standard DSv5 VM family in the East US region:

LOCATION=eastus
az vm list-usage -l $LOCATION --query "[?contains(name.value, 'standardDSv5Family')]" --output table

Before creating the ARO deployment, you must set up the networking infrastructure and verify your access permissions. The deployment requires the following:

  • Resource group setup: You’ll create a dedicated resource group containing the cluster’s virtual network.
  • Required permissions: You’ll need the permissions shown in the table below to deploy the cluster.
Permission scope | Required roles | Comments
Virtual network | Contributor + User Access Administrator, or Owner | Can be assigned at the VNet, resource group, or subscription level
Microsoft Entra ID | Tenant member user, or guest user with the Application Administrator role | Needed for service principal creation; required if using a guest account for cluster tooling operations

Registering resource providers

Before proceeding with cluster deployment, you need to register the following essential resource providers in your Azure subscription:

  • Microsoft.RedHatOpenShift
  • Microsoft.Compute
  • Microsoft.Storage
  • Microsoft.Authorization

If your account has multiple Azure subscriptions, specify the relevant subscription ID:

az account set --subscription <subscription-id>

You can check whether a particular resource provider is currently registered in your account (replace <ResourceProvider> with the provider namespace):

az provider list --query "[?namespace=='<ResourceProvider>'].registrationState" --output table

If the resource providers are not registered, you can register them as follows:

az provider register --namespace Microsoft.RedHatOpenShift --wait
az provider register --namespace Microsoft.Compute --wait
az provider register --namespace Microsoft.Storage --wait
az provider register --namespace Microsoft.Authorization --wait

You can then verify that the resource providers have been registered by using the following commands:

az provider list --query "[?namespace=='Microsoft.RedHatOpenShift'].registrationState" --output table
az provider list --query "[?namespace=='Microsoft.Compute'].registrationState" --output table
az provider list --query "[?namespace=='Microsoft.Storage'].registrationState" --output table
az provider list --query "[?namespace=='Microsoft.Authorization'].registrationState" --output table

Downloading a pull secret

The pull secret allows your cluster to access Red Hat container registries and pull images from them. Navigate to the cluster manager portal and select Download pull secret to obtain the pull secret to be used with your ARO deployment.

Although setting the pull secret during cluster creation is optional, it is recommended to include this step.

Creating resource groups

If there are existing virtual networks in your account, you can use them for your ARO deployment; here, we’ll create a new virtual network instead. The following variables must be set in the shell environment:

LOCATION=eastus              # the location of your cluster
RESOURCEGROUP=aro-rg         # the resource group where you want to create your cluster
CLUSTER=cluster              # the name of your cluster
VIRTUALNETWORK=aro-vnet      # the name of the virtual network

Resource groups act as logical containers for organizing and managing your Azure services. When creating one, you’ll select a geographic location that serves two purposes:

  • Stores metadata about the group
  • Becomes the default deployment region for contained resources (unless overridden)

A resource group can be created using the following command:

az group create --name $RESOURCEGROUP --location $LOCATION

The ARO deployment process will automatically create a second, managed resource group to hold the cluster’s infrastructure resources, such as virtual machines, storage, and networking components. Modification or deletion of resources within this managed resource group is not supported, as it can destabilize the cluster.

It’s important to point out that not all Azure regions support Red Hat OpenShift deployments. Make sure you choose from the available regions before creating the resource group.
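One way to list the regions where ARO is currently available is to query the resource provider’s metadata; this is a sketch, assuming the Microsoft.RedHatOpenShift provider reports locations for its openShiftClusters resource type:

az provider show --namespace Microsoft.RedHatOpenShift --query "resourceTypes[?resourceType=='openShiftClusters'].locations" --output table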

Creating virtual network and subnets

Create a new virtual network in the resource group created in the previous step:

az network vnet create --resource-group $RESOURCEGROUP --name $VIRTUALNETWORK --address-prefixes 10.10.0.0/21

Next, create two empty, non-overlapping subnets for the control plane and worker nodes within the VNet’s 10.10.0.0/21 range:

az network vnet subnet create --resource-group $RESOURCEGROUP --vnet-name $VIRTUALNETWORK --name master-subnet --address-prefixes 10.10.0.0/23
az network vnet subnet create --resource-group $RESOURCEGROUP --vnet-name $VIRTUALNETWORK --name worker-subnet --address-prefixes 10.10.2.0/23

To learn more about networking options in ARO, refer to this guide.

Modifying default options 

You can modify the cluster creation command based on the following options:

  • Red Hat registry access: Pass the pull secret for accessing images from Red Hat registries (--pull-secret @pull-secret.txt).
  • Custom domain configuration: You can create a custom domain for your cluster by following the instructions here. Use the --domain flag to specify your domain.
  • VM size adjustments: The default virtual machine sizes for your control plane and worker nodes are Standard_D8s_v5 and Standard_D4s_v5, respectively. If you need virtual machines of a different size, use the following flags:
    • --master-vm-size Standard_D8s_v3
    • --worker-vm-size Standard_D4s_v3
  • Version specification: You can select a specific version of ARO when deploying your cluster. Check for available versions first:
az aro get-versions --location $LOCATION

Creating the cluster

Once all these details have been finalized, you can proceed to create your cluster as follows:

az aro create --resource-group $RESOURCEGROUP --name $CLUSTER --vnet $VIRTUALNETWORK --master-subnet master-subnet --worker-subnet worker-subnet --pull-secret @pull-secret.txt

Accessing the cluster

When the cluster deployment is complete, you can connect to the cluster using the default “kubeadmin” user. You can retrieve the cluster console URL and the credentials as follows:

az aro list-credentials --name $CLUSTER --resource-group $RESOURCEGROUP

az aro show --name $CLUSTER --resource-group $RESOURCEGROUP --query "consoleProfile.url" --output tsv

To access the cluster using the “oc” command, retrieve the API endpoint using the following command:

apiServer=$(az aro show --resource-group $RESOURCEGROUP --name $CLUSTER --query apiserverProfile.url --output tsv)

You can then log in to the cluster API, supplying the kubeadmin password retrieved earlier:

oc login $apiServer --username kubeadmin --password <kubeadmin-password>

For a comprehensive list of supported configurations and restrictions, refer to the official Azure Red Hat OpenShift support policy.

Checking cluster health

The oc CLI can be used to verify that the cluster is running and healthy. The following commands provide a basic health check of the cluster’s core components.

First, ensure that the control plane (master) and worker nodes are all running and in a Ready state.

oc get nodes

The following command can be used to check the status of core cluster operators, which manage the fundamental components of the OpenShift platform. A healthy cluster shows all operators with AVAILABLE as True, PROGRESSING as False, and DEGRADED as False.

oc get clusteroperators

Check the status of core cluster pods; they should all be in a healthy state.

oc get pods --all-namespaces
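To cut through the noise of a full pod listing, a quick triage filter (a sketch using a standard Kubernetes field selector) shows only pods that are neither running nor completed:

oc get pods --all-namespaces --field-selector=status.phase!=Running,status.phase!=Succeeded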

Scaling worker nodes

We can add worker nodes to the cluster as required via machine sets. To determine the current configuration, examine the existing machines and machine sets in the cluster. The following commands show the control plane and worker machines, their associated instance types, the region/zone where they are deployed, and the machine sets that manage the workers.

oc get machines -n openshift-machine-api
oc get machinesets -n openshift-machine-api

This scaling operation can be accomplished either through the Command Line Interface (CLI) or directly via the OpenShift Web Console. From the terminal, an administrator can run a command to imperatively scale up a specific machine set to two replicas, thus increasing the total worker node count by one. Because each machine set is typically tied to a particular availability zone, and the initial state is three machine sets with one machine each, scaling one machine set to two machines results in a total of four worker nodes in the cluster. Run the following command to scale the desired machine set to increase the count of worker nodes to four.

oc scale --replicas=2 machineset <machineset-name> -n openshift-machine-api

Shutting down the cluster

The cluster can be shut down to save costs. To gracefully shut down the cluster, run the following command.

# Gracefully halt each node; "shutdown -h 1" powers the machine off one minute after the command runs
for node in $(oc get nodes -o jsonpath='{.items[*].metadata.name}'); do oc debug node/${node} -- chroot /host shutdown -h 1; done
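To bring the cluster back up later, the node VMs in ARO’s managed resource group can be started from Azure. The following is a sketch, assuming the managed resource group can be resolved from the cluster’s clusterProfile:

# Resolve the managed resource group that holds the cluster VMs
MANAGED_RG=$(az aro show --name $CLUSTER --resource-group $RESOURCEGROUP --query "clusterProfile.resourceGroupId" --output tsv | awk -F/ '{print $NF}')
# Start every VM in that resource group
az vm start --ids $(az vm list --resource-group $MANAGED_RG --query "[].id" --output tsv)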

Deleting the cluster

When a cluster is deleted, all managed objects are removed; however, resources like the resource group, virtual network, and subnets must be deleted manually. First, sign in to the Azure CLI:

az login

Select the subscription ID you want to use.

az account set --subscription {subscription ID}

Replace the following with the values used to create the cluster, then run the command to delete the cluster.

RESOURCEGROUP=
CLUSTER=
az aro delete --resource-group $RESOURCEGROUP --name $CLUSTER

Run the following command to delete the resource group.

az group delete --name $RESOURCEGROUP

The ARO deployment process can also be automated using the Azure collection in Ansible. The following playbook can be used as a sample for ARO deployment. Further details about using Ansible for ARO deployment can be found here.

- name: Create openshift cluster
  azure_rm_openshiftmanagedcluster:
    resource_group: "myResourceGroup"
    name: "myCluster"
    location: "eastus"
    cluster_profile:
      cluster_resource_group_id: "/subscriptions/xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx/resourceGroups/clusterResourceGroup"
      domain: "mydomain"
    service_principal_profile:
      client_id: "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx"
      client_secret: "xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"
    network_profile:
      pod_cidr: "10.128.0.0/14"
      service_cidr: "172.30.0.0/16"
    worker_profiles:
      - name: worker
        vm_size: "Standard_D4s_v3"
        subnet_id: "/subscriptions/xx-xx-xx-xx-xx/resourceGroups/myResourceGroup/Microsoft.Network/virtualNetworks/myVnet/subnets/worker"
        disk_size: 128
        count: 3
    master_profile:
      vm_size: "Standard_D8s_v3"
      subnet_id: "/subscriptions/xx-xx-xx-xx-xx/resourceGroups/myResourceGroup/providers/Microsoft.Network/virtualNetworks/myVnet/subnets/master"
- name: Create openshift cluster with multi parameters
  azure_rm_openshiftmanagedcluster:
    resource_group: "myResourceGroup"
    name: "myCluster"
    location: "eastus"
    cluster_profile:
      cluster_resource_group_id: "/subscriptions/xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx/resourceGroups/clusterResourceGroup"
      domain: "mydomain"
      fips_validated_modules: Enabled
    service_principal_profile:
      client_id: "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx"
      client_secret: "xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"
    network_profile:
      pod_cidr: "10.128.0.0/14"
      service_cidr: "172.30.0.0/16"
      outbound_type: Loadbalancer
      preconfigured_nsg: Disabled
    worker_profiles:
      - name: worker
        vm_size: "Standard_D4s_v3"
        subnet_id: "/subscriptions/xx-xx-xx-xx-xx/resourceGroups/myResourceGroup/Microsoft.Network/virtualNetworks/myVnet/subnets/worker"
        disk_size: 128
        count: 3
        encryption_at_host: Disabled
    master_profile:
      vm_size: "Standard_D8s_v3"
      subnet_id: "/subscriptions/xx-xx-xx-xx-xx/resourceGroups/myResourceGroup/providers/Microsoft.Network/virtualNetworks/myVnet/subnets/master"
      encryption_at_host: Disabled

Backing up OpenShift using Trilio

OpenShift hosts mission-critical applications, configurations, and persistent data. Losing any of these can lead to severe downtime, data corruption, or compliance violations. Backups ensure disaster recovery, migration flexibility, and protection against accidental deletions, cyberattacks, or cluster failures.

Several tools are commonly used for backing up OpenShift, each with limitations. One of the more popular open-source solutions is Velero, which provides basic backup capabilities but lacks application-consistent snapshots. Commercial solutions such as Kasten K10 offer policy-driven backups but have limited OpenShift-specific optimizations and can be complex to deploy in multi-tenant environments. Azure Kubernetes Service (AKS) Backup offers a native, Azure-focused backup tool, but it is tied to Azure services and lacks the granular scope needed for complex OpenShift workloads. There is also the option to use native storage snapshots from cloud providers, but these only capture disk states without Kubernetes object consistency, requiring additional manual recovery steps.

This is where solutions like Trilio come into play. Trilio is engineered as an enterprise-grade platform designed to address modern organizations’ complex data protection and disaster recovery needs. Unlike generic backup tools, Trilio ensures proper application-consistent backups, which are critical for databases and stateful workloads. It natively supports OpenShift Operators and custom resources, enabling the seamless backup and recovery of Helm releases and CRDs. With granular recovery options, administrators can restore individual namespaces or entire clusters while avoiding vendor lock-in.

When used with a solution such as ARO, Trilio can offer the following capabilities:

  • Automated backups: Trilio enables scheduling point-in-time backups and offers flexible recovery options.
  • Continuous restore: Trilio’s continuous restore capabilities enable building effective disaster recovery strategies. Regardless of your cloud provider, they ensure that applications can be quickly restored after a disaster. Trilio’s innovative architecture allows multiple primary application clusters to be continuously converted to one dedicated disaster recovery (DR) cluster. This approach cuts down on infrastructure costs compared to a traditional one-to-one primary-to-DR cluster model.
  • Migration and platform portability: Trilio empowers organizations to seamlessly migrate applications between diverse environments, such as Azure Red Hat OpenShift and Red Hat OpenShift Service on AWS (ROSA), or even on-premises clusters. For instance, customers can easily migrate applications back to on-premises infrastructure if cloud costs become prohibitive.
  • Multi-cluster management: The integration of Trilio with Red Hat Advanced Cluster Management for Kubernetes (RHACM) facilitates the definition and orchestration of policy-driven data protection across a diverse range of Kubernetes deployments, including hybrid, multi-cloud, and edge environments.

The following table compares Trilio against some of the other backup solutions.

 

Feature | Traditional backup solutions (AKS Backup, Velero, Kasten) | Trilio
Application scope and consistency | Limited scope, often missing essential application components like Helm charts, VMs, operators, and user-defined labels. Velero, a popular open-source option, notably lacks native application-consistent snapshots, posing risks to stateful workloads and databases. | Significantly more granular backup options, including native support for namespaces, user-defined labels, stateful operators, and complex Helm charts, ensuring complete application-consistent recovery.
Storage flexibility and data sovereignty | Often restricted to the cloud provider’s native storage tiers (like AKS Backup) or requiring significant manual configuration to integrate external or diverse S3 providers for cost-effective tiering and data sovereignty. | Supports a wide variety of storage targets and external S3 providers, enabling greater data sovereignty, cost optimization through flexible tiering, and freedom from native cloud storage restrictions.
Disaster recovery and vendor lock-in | Native tools like AKS Backup are limited to the same cloud/region, offering no support for cross-cloud migrations or true hybrid-cloud DR. The resulting vendor lock-in makes moving applications to different platforms complex and resource-intensive. | Facilitates true cross-cloud migrations and robust disaster recovery (DR). Trilio avoids vendor lock-in by providing portable backup data and the flexibility to restore across different cloud providers or on-premises environments.
Enterprise management and identity | Management interfaces are often fragmented across native cloud consoles and complex Kubernetes tooling, lacking a unified multi-cluster or multi-tenant user interface. Identity support can also be restricted (e.g., AKS Backup supports only Managed System Identity). | Provides an advanced multi-cluster, hybrid cloud, multi-tenant UI for centralized management and visibility, plus support for multiple identity management methods to match diverse enterprise security policies.

Best practices for deploying OpenShift

Here are some of the recommended practices for deploying OpenShift:

  • Engage in proper capacity planning: Thoroughly assess and allocate resources before cluster deployment. Sufficient compute and memory resources must be allocated to cluster nodes so that workloads are not adversely affected. Worker node capacity should account for current application needs and projected growth, typically with a 20-30% buffer. High-performance storage like NVMe is strongly recommended for etcd backends to maintain cluster responsiveness.
  • Implement strong network security: Build in strict network segmentation from day one. Isolate control plane traffic from worker nodes and external connections using dedicated VLANs or network policies. Restrict traffic between namespaces and applications through fine-grained network policies. Place ingress controllers in a secured perimeter zone with mandatory TLS encryption for all external traffic.
  • Optimize for performance: When deploying OpenShift in on-premises networks, carefully configure API and ingress virtual IPs (VIPs) on low-latency networks (<2 ms response time). Conduct load testing to validate VIP configurations under peak traffic conditions. Distribute ingress endpoints across availability zones for redundancy and balance API server loads across control plane nodes.
  • Protect data: Implement a comprehensive backup strategy before moving to production. This should include hourly etcd backups and daily application data snapshots stored across geographically separate locations; Red Hat’s documented etcd backup script is sketched after this list.
  • Ensure operational preparedness: Maintain detailed runbooks for critical operations, including certificate rotations, node replacements, and emergency recovery procedures.
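For the etcd portion of that backup strategy, Red Hat documents a cluster-backup.sh script shipped on control plane nodes; the following is a minimal sketch (the node name is a placeholder, and the archive is written to the node’s own filesystem):

# Run Red Hat's etcd backup script on one control plane node
oc debug node/master-0 -- chroot /host /usr/local/bin/cluster-backup.sh /home/core/assets/backup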

Conclusion

Creating OpenShift clusters requires careful planning, whether deploying managed services like ARO/ROSA or self-managed installations. For quick deployment with reduced operational overhead, managed OpenShift solutions offer the fastest path to production with built-in maintenance and scaling. Self-managed clusters provide greater customization but demand more expertise for setup, upgrades, and day-to-day operations.

The deployment approach should align with your team’s skills and workload requirements: managed services for teams wanting to focus on applications and self-managed for those needing granular control. Regardless of the method, all OpenShift deployments benefit from proper resource allocation, network segmentation, and backup strategies.
