Reference Guide: Optimizing Backup Strategies for Red Hat OpenShift Virtualization

  • Follow these steps:

    oc -n trilio-system delete secret triliovault-dex # delete the triliovault-dex secret
    oc -n trilio-system delete po <admission webhook pod> # restart the k8s-triliovault-admission-webhook pod
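
    A minimal sketch for locating the webhook pod by name (assuming the pod name contains "admission-webhook", as in a default installation):

    oc -n trilio-system get pods | grep admission-webhook # find the exact pod name
    oc -n trilio-system delete $(oc -n trilio-system get pods -o name | grep admission-webhook) # delete it; its Deployment recreates it
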
  • On the undercloud host, run the following command:

            openstack tripleo container image list | grep trilio
  • Run the following command on the compute node:

    sudo podman ps --all --format "{{.Names}} {{.Ports}} {{.Mounts}} {{.Status}}"
    
  • RHOSP:

    # Pre-5.x
    /var/log/containers/trilio-datamover-api/dmapi.log # controllers
    /var/log/containers/trilio-datamover/tvault-contego.log # computes
    /var/log/tvault-object-store/tvault-object-store.log # inside the trilio-datamover container
    
    # v5.x
    /var/log/containers/triliovault-wlm-api/triliovault-wlm-api.log # controllers
    /var/log/containers/triliovault-wlm-api/triliovault-object-store.log # controllers
    /var/log/containers/triliovault-wlm-cron/triliovault-wlm-cron.log # controllers
    /var/log/containers/triliovault-wlm-cron/triliovault-object-store.log # controllers
    /var/log/containers/triliovault-wlm-scheduler/triliovault-wlm-scheduler.log # controllers
    /var/log/containers/triliovault-wlm-scheduler/triliovault-object-store.log # controllers
    /var/log/containers/triliovault-wlm-workloads/triliovault-wlm-workloads.log # controllers
    /var/log/containers/triliovault-wlm-workloads/triliovault-object-store.log # controllers
    /var/log/containers/triliovault-datamover-api/triliovault-datamover-api.log # controllers
    /var/log/containers/triliovault-datamover/triliovault-datamover.log # computes
    /var/log/containers/triliovault-datamover/triliovault-object-store.log # computes
    
  • 1. Grab the restore ID:

    workloadmgr snapshot-list --workload_id a0e61dc0-14eb-4e43-8c4f-ae25ac4fa8c0
    workloadmgr restore-list --snapshot_id 546b542a-9fba-4484-848d-da730b4e46c6

    2. In the TVM logs, get the ID of the instance that was trying to spin up:

    sudo cat /var/log/workloadmgr/workloadmgr-workloads.log* | grep CopyBackupImageToVolume.execute.*ENTER | grep <restore_id> | awk '{print $9}' | cut -f1 -d"," | sort | uniq
    

    3. Grab the host from the OpenStack DB (inside the OpenStack controller):

    # Ansible
    sudo lxc-attach `sudo lxc-ls -f | grep galera | awk '{print $1}'` -- mysql -u root -p`sudo cat /etc/openstack_deploy/user_secrets.yml | grep ^galera_root_password | awk '{print $NF}'` -e "select uuid,display_name,node,launched_on,created_at,deleted_at from nova.instances where uuid='a2454040-d4ed-4711-8d5e-6b3637f69aa9'\G"
    
    # Kolla
    sudo docker exec -itu root `sudo docker ps | grep mariadb | awk '{print $NF}'` mysql -u root -p`sudo cat /etc/kolla/passwords.yml | grep ^database_password | awk '{print $NF}'` -e "select uuid,display_name,node,launched_on,created_at,deleted_at from nova.instances where uuid='a2454040-d4ed-4711-8d5e-6b3637f69aa9'\G"
    
    # RHOSP13
    sudo docker exec -itu root `sudo docker ps -q -f name=galera` mysql -u root -p`sudo hiera -c /etc/puppet/hiera.yaml "mysql::server::root_password"` -e "select uuid,display_name,node,launched_on,created_at,deleted_at from nova.instances where uuid='a2454040-d4ed-4711-8d5e-6b3637f69aa9'\G"
    
    # RHOSP16
    sudo podman exec -itu root `sudo podman ps -q -f name=galera` mysql -u root -p`sudo hiera -c /etc/puppet/hiera.yaml "mysql::server::root_password"` -e "select uuid,display_name,node,launched_on,created_at,deleted_at from nova.instances where uuid='a2454040-d4ed-4711-8d5e-6b3637f69aa9'\G"
    
    # Canonical
    MYSQLPWD=$(juju exec --unit mysql-innodb-cluster/leader leader-get mysql.passwd)
    juju exec --unit mysql-innodb-cluster/leader "mysql -uroot -p${MYSQLPWD} -e \"select uuid,display_name,node,launched_on,created_at,deleted_at from nova.instances where uuid='a2454040-d4ed-4711-8d5e-6b3637f69aa9'\"\\\G"
    
  • sudo docker ps -a | grep -Ei 'trilio|tvault|workloadmgr|s3|dmapi|dm_api|datamover|data_mover|contego' # both controllers and computes # if RHOSP 16 or higher, use 'podman'
    

    Then, compare the versions with the official documentation [1]. Navigate through the different hotfixes to find the one that matches all of your WLM component versions.

    [1] https://docs.trilio.io/openstack/v/tvo-4.2/triliovault-4.2-release-notes

  • No. T4K cannot back up hot-plugged (HotPlug) disks connected to Virtual Machines running on OpenShift.

  • Install the Krew plugin to help collect and submit log bundles whenever you encounter issues with Trilio backup and restore. Detailed instructions can be found in the TVK Log Collector documentation: https://docs.trilio.io/kubernetes/krew-plugins/tvk-log-collector.

  • The user first needs to create a master encryption secret in the trilio-system namespace, following this document: https://docs.trilio.io/kubernetes/getting-started/using-trilio/post-install-configuration#encryption

    After that, the user can create encrypted backups in the UI, which can also be used to create the namespace-level encryption secret. Please refer to this document: https://docs.trilio.io/kubernetes/getting-started/using-trilio/getting-started-with-management-console/index/creating-backups#encrypting-backups

    To restore to the same cluster, both the master encryption secret and the namespace-level encryption secret created with the same key must be present.

    To restore to a different cluster, the user first needs to create both the master encryption secret and the namespace-level encryption secret there.
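
    As a minimal sketch, both secrets can also be created from the CLI as generic Secrets carrying the encryption key; the secret names and the data key "encryptKey" below are illustrative only, so follow the linked documents for the exact format Trilio expects:

    oc -n trilio-system create secret generic tvk-master-encryption-secret --from-literal=encryptKey='<your-encryption-key>' # hypothetical name/key
    oc -n <app-namespace> create secret generic app-encryption-secret --from-literal=encryptKey='<your-encryption-key>' # namespace-level secret, same key needed for restores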

  • Trilio uses Custom Resource Definitions (CRDs) and operates within project scope in OpenShift, allowing similar Role-Based Access Control (RBAC) to be applied to Trilio resources. Users can control actions on specific Trilio resources, ensuring proper access management. For more details, please refer to the Trilio RBAC documentation.
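
    As a minimal sketch, a namespace-scoped read-only Role over Trilio resources could be created as follows (assuming the triliovault.trilio.io API group used by the T4K CRDs; the role, namespace, and user names are illustrative):

    oc -n <app-namespace> create role trilio-backup-viewer --verb=get,list,watch --resource=backupplans.triliovault.trilio.io,backups.triliovault.trilio.io,restores.triliovault.trilio.io
    oc -n <app-namespace> create rolebinding trilio-backup-viewer-binding --role=trilio-backup-viewer --user=<username>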

  • # Upstream

    kubectl patch triliovaultmanager triliovault-manager -p '{"spec":{"logLevel":"Debug","datamoverLogLevel":"Debug"}}' --type merge
    #kubectl edit triliovaultmanager triliovault-manager
    #kubectl get configmap k8s-triliovault-config -o yaml | grep -i loglevel
    #kubectl patch triliovaultmanager triliovault-manager -p '{"spec":{"logLevel":null,"datamoverLogLevel":null}}' --type merge # revert to standard level
    

    # RHOCP/OKD (v3.x or higher)

    oc patch triliovaultmanager triliovault-manager -n trilio-system -p '{"spec":{"logLevel":"Debug","datamoverLogLevel":"Debug"}}' --type merge
    #oc edit triliovaultmanager triliovault-manager -n trilio-system
    #oc get configmap k8s-triliovault-config -n trilio-system -o yaml | grep -i loglevel
    #oc patch triliovaultmanager triliovault-manager -n trilio-system -p '{"spec":{"logLevel":null,"datamoverLogLevel":null}}' --type merge # revert to standard level
    

    # RHOCP/OKD (Pre-3.x)

    oc patch configmap k8s-triliovault-config -n openshift-operators -p '{"data":{"tvkConfig":"name: tvk-instance\nlogLevel: Debug\ndatamoverLogLevel: Debug"}}'
    #oc edit configmap -n openshift-operators k8s-triliovault-config
    #oc patch configmap k8s-triliovault-config -n openshift-operators -p '{"data":{"tvkConfig":"name: tvk-instance"}}' # revert to standard level
    
  • # Upstream

    kubectl version # alternative: kubectl get nodes -o yaml | grep -w kubeletVersion
    kubectl describe deployment k8s-triliovault-control-plane | grep RELEASE_TAG
    

    # RHOCP/OKD

    oc version # RHOCP/OKD version. Alternative: oc get nodes -o yaml | grep -w containerRuntimeVersion # rhaos<ocp_version>.<commit>.<rhel_version>
    oc get nodes # upstream K8s version running on each node. Alternative: oc get nodes -o yaml | grep -w kubeletVersion
    oc describe deployment k8s-triliovault-control-plane -n trilio-system | grep RELEASE_TAG # v3.x or higher
    oc describe deployment k8s-triliovault-control-plane -n openshift-operators | grep RELEASE_TAG # Pre-3.x
    
  • No. T4K cannot back up hot-plugged (HotPlug) disks connected to Virtual Machines running on OpenShift.

  • During the installation of Trilio, you can enable observability to view logs directly in the UI instead of using the CLI. For detailed instructions, please refer to the Trilio Observability Integration documentation.

    https://docs.trilio.io/kubernetes/advanced-configuration/observability/tvk-integration-with-observability-stack
  • Connecting to and testing an S3 bucket:

    sudo apt install -y awscli # Ubuntu
    sudo yum install -y awscli # RHEL
    #curl "https://awscli.amazonaws.com/awscli-exe-linux-x86_64.zip" -o "awscliv2.zip" && unzip awscliv2.zip && sudo ./aws/install # if no repos are available
    #sudo pip3 install awscli # this command can also be used if running from a TVM. Using sudo is MANDATORY so we don't touch the virtual environment!
    
    export AWS_ACCESS_KEY_ID=Z3GZYLQN7Jaaaaaaaaab
    export AWS_SECRET_ACCESS_KEY=abcdefghvlkdNzvvzmrkFpd1R5pKg4aoME7IhSXp
    export AWS_DEFAULT_REGION=default
    export AWS_REGION=${AWS_DEFAULT_REGION}
    #export AWS_CA_BUNDLE=./ca-bundle.crt # to specify a CA bundle. To completely bypass SSL check, add --no-verify-ssl to aws commands
    export S3_ENDPOINT_URI=http://ceph.trilio.demo/ # or http://172.22.0.3/
    export S3_TEST_BUCKET_URI=bucket1
    
    dd if=/dev/urandom of=./testimage.img bs=1K count=102400 iflag=fullblock # create a 100MB test image
    aws s3 --endpoint-url $S3_ENDPOINT_URI mb s3://${S3_TEST_BUCKET_URI}
    aws s3 --endpoint-url $S3_ENDPOINT_URI ls | grep ${S3_TEST_BUCKET_URI} # confirm the bucket exists
    aws s3 --endpoint-url $S3_ENDPOINT_URI cp ./testimage.img s3://${S3_TEST_BUCKET_URI}/
    aws s3 --endpoint-url $S3_ENDPOINT_URI ls s3://${S3_TEST_BUCKET_URI} | grep testimage.img # confirm the image is in the bucket
    aws s3 --endpoint-url $S3_ENDPOINT_URI rm s3://${S3_TEST_BUCKET_URI}/testimage.img
    rm -f ./testimage.img
    #aws s3 --endpoint-url $S3_ENDPOINT_URI rb s3://${S3_TEST_BUCKET_URI} # only if this bucket was created only for this purpose. Add --force to forcefully delete all contents
    #Note: if any issues are found while running above aws commands, get more detail by adding the flag --debug
    
  • The paths inside the target have the following format:

    ./<target_path>/76eef5b5-c47e-4a0e-a678-2d5c07e14cc1/263a4aae-bde1-49cd-8685-b3869328f0b4
    
    ^ First UID is from BackupPlan: kubectl get backupplan <bakplan> -o jsonpath='{.metadata.uid}'
    ^ Second UID is from Backup: kubectl get backup <bak> -o jsonpath='{.metadata.uid}'
    
    # When using ClusterBackup, we will have three directories under <target_path> (considering we are backing up 2 namespaces):
    # ./<target_path>/<ClusterBackupPlan_UID>/<ClusterBackup_UID>
    # ./<target_path>/<BackupPlan_NS1_UID>/<Backup_NS1_UID>
    # ./<target_path>/<BackupPlan_NS2_UID>/<Backup_NS2_UID>
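
    A minimal sketch tying these together (assuming the target is mounted locally at a hypothetical path and using the BackupPlan/Backup names from above):

    TARGET_PATH=/mnt/trilio-target # hypothetical local mount of the target
    BPLAN_UID=$(kubectl get backupplan <bakplan> -o jsonpath='{.metadata.uid}')
    BAK_UID=$(kubectl get backup <bak> -o jsonpath='{.metadata.uid}')
    ls -l "${TARGET_PATH}/${BPLAN_UID}/${BAK_UID}" # the data for <bak> lives here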
    
  • # Upstream

    kubectl patch triliovaultmanager triliovault-manager -p '{"spec":{"logLevel":"Debug","datamoverLogLevel":"Debug"}}' --type merge
    #kubectl edit triliovaultmanager triliovault-manager
    #kubectl get configmap k8s-triliovault-config -o yaml | grep -i loglevel
    #kubectl patch triliovaultmanager triliovault-manager -p '{"spec":{"logLevel":null,"datamoverLogLevel":null}}' --type merge # revert to standard level
    

    # RHOCP/OKD (v3.x or higher)

    oc patch triliovaultmanager triliovault-manager -n trilio-system -p '{"spec":{"logLevel":"Debug","datamoverLogLevel":"Debug"}}' --type merge
    #oc edit triliovaultmanager triliovault-manager -n trilio-system
    #oc get configmap k8s-triliovault-config -n trilio-system -o yaml | grep -i loglevel
    #oc patch triliovaultmanager triliovault-manager -n trilio-system -p '{"spec":{"logLevel":null,"datamoverLogLevel":null}}' --type merge # revert to standard level
    

    # RHOCP/OKD (Pre-3.x)

    oc patch configmap k8s-triliovault-config -n openshift-operators -p '{"data":{"tvkConfig":"name: tvk-instance\nlogLevel: Debug\ndatamoverLogLevel: Debug"}}'
    #oc edit configmap -n openshift-operators k8s-triliovault-config
    #oc patch configmap k8s-triliovault-config -n openshift-operators -p '{"data":{"tvkConfig":"name: tvk-instance"}}' # revert to standard level
    
  • 1. Run:

    (
      set -x; cd "$(mktemp -d)" &&
      OS="$(uname | tr '[:upper:]' '[:lower:]')" &&
      ARCH="$(uname -m | sed -e 's/x86_64/amd64/' -e 's/\(arm\)\(64\)\?.*/\1\2/' -e 's/aarch64$/arm64/')" &&
      KREW="krew-${OS}_${ARCH}" &&
      curl -fsSLO "https://github.com/kubernetes-sigs/krew/releases/latest/download/${KREW}.tar.gz" &&
      tar zxvf "${KREW}.tar.gz" &&
      ./"${KREW}" install krew
    )
    

    2. Add the following line to ~/.bashrc:

    export PATH="${KREW_ROOT:-$HOME/.krew}/bin:$PATH"
    

    3. Install T4K Log Collector:

    kubectl krew index add tvk-plugins https://github.com/trilioData/tvk-plugins.git
    kubectl krew update
    kubectl krew install tvk-plugins/tvk-log-collector
    #kubectl krew upgrade tvk-log-collector # upgrade the tvk-log-collector plugin
    #kubectl krew uninstall tvk-log-collector # uninstall
    

    4. Use it:

    kubectl tvk-log-collector --clustered --log-level debug

    More information here: https://docs.trilio.io/kubernetes/v/3.0.x/krew-plugins/tvk-log-collector

  • # Upstream

    kubectl version # alternative: kubectl get nodes -o yaml | grep -w kubeletVersion
    kubectl describe deployment k8s-triliovault-control-plane | grep RELEASE_TAG
    

    # RHOCP/OKD

    oc version # RHOCP/OKD version. Alternative: oc get nodes -o yaml | grep -w containerRuntimeVersion # rhaos<ocp_version>.<commit>.<rhel_version>
    oc get nodes # upstream K8s version running on each node. Alternative: oc get nodes -o yaml | grep -w kubeletVersion
    oc describe deployment k8s-triliovault-control-plane -n trilio-system | grep RELEASE_TAG # v3.x or higher
    oc describe deployment k8s-triliovault-control-plane -n openshift-operators | grep RELEASE_TAG # Pre-3.x
    
  • A. On the Trilio Appliance, source the credentials of the cloud administrator.

    B. Run this command: source /home/stack/myansible/bin/activate

    C. Create the cloud admin trust using the following command:

    Syntax: workloadmgr trust-create [--is_cloud_trust {True,False}] <role_name>
    Examples:
    workloadmgr trust-create --is_cloud_trust True admin
    workloadmgr trust-create --is_cloud_trust True Admin # for Canonical-based OpenStack, where the admin role is named 'Admin'
    
  • Retrieve the values of the verbose and debug flags using the following commands:

    juju config trilio-wlm verbose
    juju config trilio-wlm debug

    To set verbose and debug to true, use the following commands:

    juju config trilio-wlm verbose=True
    juju config trilio-wlm debug=True
    
  • 1. Grab the restore ID:

    workloadmgr snapshot-list --workload_id a0e61dc0-14eb-4e43-8c4f-ae25ac4fa8c0
    workloadmgr restore-list --snapshot_id 546b542a-9fba-4484-848d-da730b4e46c6
    

    2. In the TVM logs, get the ID of the instance that was trying to spin up:

    sudo cat /var/log/workloadmgr/workloadmgr-workloads.log* | grep CopyBackupImageToVolume.execute.*ENTER | grep <restore_id> | awk '{print $9}' | cut -f1 -d"," | sort | uniq
    

    3. Grab the host from the OpenStack DB (inside the OpenStack controller):

    MYSQLPWD=$(juju exec --unit mysql-innodb-cluster/leader leader-get mysql.passwd)
    juju exec --unit mysql-innodb-cluster/leader "mysql -uroot -p${MYSQLPWD} -e \"select uuid,display_name,node,launched_on,created_at,deleted_at from nova.instances where uuid='a2454040-d4ed-4711-8d5e-6b3637f69aa9'\"\\\G"
    
  • /var/log/workloadmgr/workloadmgr-api.log # trilio-wlm
    /var/log/workloadmgr/workloadmgr-cron.log # trilio-wlm
    /var/log/workloadmgr/workloadmgr-filesearch.log # trilio-wlm
    /var/log/workloadmgr/workloadmgr-scheduler.log # trilio-wlm
    /var/log/workloadmgr/workloadmgr-workloads.log # trilio-wlm
    /var/log/dmapi/dmapi.log # trilio-dm-api
    /var/log/nova/tvault-contego.log # trilio-data-mover
    /var/log/tvault-object-store/tvault-object-store.log # trilio-wlm, trilio-data-mover
    
  • A. On the Trilio Appliance, source the credentials of the cloud administrator.

    B. Run this command: source /home/stack/myansible/bin/activate

    C. Create the cloud admin trust using the following command:

    Syntax: workloadmgr trust-create [--is_cloud_trust {True,False}] <role_name>
    Examples:
    workloadmgr trust-create --is_cloud_trust True admin
    workloadmgr trust-create --is_cloud_trust True Admin # for Canonical-based OpenStack, where the admin role is named 'Admin'
    
  • To install the latest available workloadmgr package for a Trilio release from the Trilio repository, follow these steps:

    Create the Trilio yum repository file /etc/yum.repos.d/trilio.repo
    

    Enter the following details into the repository file:

    [trilio]
    name=Trilio Repository
    baseurl=http://trilio:[email protected]:8283/triliovault-<Trilio-Release>/yum/
    enabled=1
    gpgcheck=0
    

    Install the workloadmgr client by issuing the following command:

    For CentOS7: yum install workloadmgrclient
    For CentOS8: yum install python-3-workloadmgrclient-el8
    

    An example installation can be found below:

    [root@controller ~]# cat /etc/yum.repos.d/trilio.repo
    [trilio]
    name=Trilio Repository
    baseurl=http://trilio:[email protected]:8283/triliovault-4.0/yum/
    enabled=1
    gpgcheck=0
    
  • If the Trilio Dashboard password is lost, it can be reset as long as SSH access to the appliance is available.

    To reset the password to its default, do the following:

    [root@TVM1 ~]# source /home/stack/myansible/bin/activate
    (myansible) [root@TVM1 ~]# cd /home/stack/myansible/lib/python3.6/site-packages/tvault_configurator
    (myansible) [root@TVM1 tvault_configurator]# python recreate_conf.py
    (myansible) [root@TVM1 tvault_configurator]# systemctl restart tvault-config
    

    The dashboard login will be reset to:

    Username: admin

    Password: password

  • On the Trilio Appliance, run this command:

    grep trustee /etc/workloadmgr/workloadmgr.conf
    
  • Follow these steps on the Trilio Vault Appliance:

    A. Source the OpenStack credentials for the admin user.

    B. Activate the virtual environment with the following command:

    source /home/stack/myansible/bin/activate

    C. First, disable the scheduler of the workload with the following command:

    workloadmgr --os-project-id <project-id> workload-modify <workload-id> --jobschedule enabled=false

    D. Enable the scheduler again using the following command:

    workloadmgr --os-project-id <project-id> workload-modify <workload-id> --jobschedule enabled=true \
     --jobschedule start_time='5:00 PM' --jobschedule interval='24 hr' \
     --jobschedule retention_policy_type='Number of days to retain Snapshots' \
     --jobschedule retention_policy_value='30' \
     --jobschedule fullbackup_interval='7'
  • A. Enabling and disabling the Global Job Scheduler requires the cloud admin user's privileges, so the first step is to source the file containing the cloud admin user's credentials.

    B. Run the following command: source /home/stack/myansible/bin/activate

    C. Verify the status of the Global Job Scheduler using the following command:

    workloadmgr get-global-job-scheduler

    D. To disable the Global Job Scheduler, run the following command:

    workloadmgr disable-global-job-scheduler

    E. To enable the Global Job Scheduler, run the following command:

    workloadmgr enable-global-job-scheduler

  • To migrate VMs between two OpenStack clouds, use the same NFS/S3 storage or ensure replication at the storage level. Assuming both clouds share the same backup storage, this process involves admin functions. First, identify workloads not originating from the same cloud using:

    workloadmgr workload-get-importworkloads-list

    Then, import the identified workloads to the second cloud with:

    workloadmgr workload-reassign-workloads --new_tenant_id <new_tenant_id> --workload_ids <workload_id> --user_id <user_id> --migrate_cloud True
    

    Finally, restore the workload backup on the second cloud, using selective restore to map the VM to new resources such as network and volume type.
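
    Putting the flow together on the destination cloud, a minimal sketch (IDs are placeholders; the selective restore itself is then driven from Horizon or the CLI):

    workloadmgr workload-get-importworkloads-list # workloads on the shared target that did not originate on this cloud
    workloadmgr workload-reassign-workloads --new_tenant_id <new_tenant_id> --workload_ids <workload_id> --user_id <user_id> --migrate_cloud True
    workloadmgr snapshot-list --workload_id <workload_id> # pick the snapshot to restore selectively, mapping networks/volume types to the new cloud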

  • Follow these steps for a proper upgrade: first, upgrade your cloud environment; then, upgrade Trilio according to the specific cloud distribution you are using.

  • Control Plane Services:

    wlm-cron (runs on one node)
    wlm-api (runs on all controller nodes)
    wlm-workloads (runs on all controller nodes)
    wlm-scheduler (runs on all controller nodes)
    trilio-datamover-api (runs on all controller nodes)

    Compute Plane Services:

    trilio-datamover (runs on all compute nodes)
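
    A quick way to confirm this layout on a containerized deployment is to reuse the container checks from elsewhere in this FAQ (a sketch; use 'docker' instead of 'podman' on RHOSP releases older than 16):

    sudo podman ps -a | grep -Ei 'wlm|datamover' # controller: wlm-api/cron/scheduler/workloads and datamover-api containers
    sudo podman ps -a | grep -Ei 'datamover' # compute: the trilio-datamover container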
    
  • Trilio is designed to support multi-tenant environments through role-based access control (RBAC). Each tenant can manage their own backups and restores independently, ensuring isolation and security between tenants.

  • Restoring a VM on an external or provider network is an admin-level function. To use Trilio for backing up and restoring VMs on an external network, ensure the Trustee role is set as admin during Trilio installation. Additionally, the user performing the backup and restore must have an admin role in the project.
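
    To check how the trustee role is currently configured, the same grep used elsewhere in this FAQ can be reused (a sketch; run it on the Trilio Appliance/WLM node):

    grep trustee /etc/workloadmgr/workloadmgr.conf # the trustee role should be admin for external/provider network restores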

  • Connecting to and testing an S3 bucket:

    sudo apt install -y awscli # Ubuntu
    sudo yum install -y awscli # RHEL
    #curl "https://awscli.amazonaws.com/awscli-exe-linux-x86_64.zip" -o "awscliv2.zip" && unzip awscliv2.zip && sudo ./aws/install # if no repos are available
    #sudo pip3 install awscli # this command can also be used if running from a TVM. Using sudo is MANDATORY so we don't touch the virtual environment!
    
    export AWS_ACCESS_KEY_ID=Z3GZYLQN7Jaaaaaaaaab
    export AWS_SECRET_ACCESS_KEY=abcdefghvlkdNzvvzmrkFpd1R5pKg4aoME7IhSXp
    export AWS_DEFAULT_REGION=default
    export AWS_REGION=${AWS_DEFAULT_REGION}
    #export AWS_CA_BUNDLE=./ca-bundle.crt # to specify a CA bundle. To completely bypass SSL check, add --no-verify-ssl to aws commands
    export S3_ENDPOINT_URI=http://ceph.trilio.demo/ # or http://172.22.0.3/
    export S3_TEST_BUCKET_URI=bucket1
    
    dd if=/dev/urandom of=./testimage.img bs=1K count=102400 iflag=fullblock # create a 100MB test image
    aws s3 --endpoint-url $S3_ENDPOINT_URI mb s3://${S3_TEST_BUCKET_URI}
    aws s3 --endpoint-url $S3_ENDPOINT_URI ls | grep ${S3_TEST_BUCKET_URI} # confirm the bucket exists
    aws s3 --endpoint-url $S3_ENDPOINT_URI cp ./testimage.img s3://${S3_TEST_BUCKET_URI}/
    aws s3 --endpoint-url $S3_ENDPOINT_URI ls s3://${S3_TEST_BUCKET_URI} | grep testimage.img # confirm the image is in the bucket
    aws s3 --endpoint-url $S3_ENDPOINT_URI rm s3://${S3_TEST_BUCKET_URI}/testimage.img
    rm -f ./testimage.img
    #aws s3 --endpoint-url $S3_ENDPOINT_URI rb s3://${S3_TEST_BUCKET_URI} # only if this bucket was created only for this purpose. Add --force to forcefully delete all contents
    #Note: if any issues are found while running above aws commands, get more detail by adding the flag --debug
    
  • T4O:

    With NFS, the target is mounted on the TVM/computes at: /var/trilio/triliovault-mounts/<base64_mount_point>/
    With S3/Ceph, the target is mounted on the TVM/computes at: /var/trilio/triliovault-mounts/
    

    # The mount point might be different in different Trilio versions or editions

    Mount point hash:

    # Trilio Pre-4.2:

    > Converting from mount point to base64:
    echo -n 10.105.105.111:/vol/trilio | base64 
    
    > Converting from base64 to mount point:
    echo MTAuMTA1LjEwNS4xMTE6L3ZvbC90cmlsaW8= | base64 -d
    
    # Trilio 4.2 or higher:
    
    > Converting from mount point to base64:
    echo -n /vol/trilio | base64 
    
    > Converting from base64 to mount point:
    echo L3ZvbC90cmlsaW8= | base64 -d
    
    # There is one directory under the mountpoint with the UUID of the user triliovault, with metadata information # openstack user list --project service
    # The others are named workload_<wid>
    # For containers, in computes, mounts are inside the Trilio container: sudo docker exec -itu root triliovault_datamover bash
    
  • TVM:

    /var/log/workloadmgr/tvault-config.log # logs from the T4O configuration
    /var/log/workloadmgr/ansible-playbook.log # pre-checks from the configurator
    /var/log/workloadmgr/workloadmgr-api.log
    /var/log/workloadmgr/workloadmgr-cron.log
    /var/log/workloadmgr/workloadmgr-filesearch.log
    /var/log/workloadmgr/workloadmgr-scheduler.log
    /var/log/workloadmgr/workloadmgr-workloads.log
    /var/log/workloadmgr/s3vaultfuse.py.log # logs from the S3 plugin
    

    Kolla:

    # Pre-5.x
    /var/log/kolla/triliovault-datamover-api/dmapi.log # controllers
    /var/log/kolla/triliovault-datamover/tvault-contego.log # computes
    /var/log/tvault-object-store/tvault-object-store.log # inside the triliovault-datamover container
    
    # v5.x
    /var/log/kolla/triliovault-wlm-api/triliovault-wlm-api.log # controllers
    /var/log/kolla/triliovault-wlm-api/triliovault-object-store.log # controllers
    /var/log/kolla/triliovault-wlm-cron/triliovault-wlm-cron.log # controllers
    /var/log/kolla/triliovault-wlm-cron/triliovault-object-store.log # controllers
    /var/log/kolla/triliovault-wlm-scheduler/triliovault-wlm-scheduler.log # controllers
    /var/log/kolla/triliovault-wlm-scheduler/triliovault-object-store.log # controllers
    /var/log/kolla/triliovault-wlm-workloads/triliovault-wlm-workloads.log # controllers
    /var/log/kolla/triliovault-wlm-workloads/triliovault-object-store.log # controllers
    /var/log/kolla/triliovault-datamover-api/triliovault-datamover-api.log # controllers
    /var/log/kolla/triliovault-datamover/triliovault-datamover.log # computes
    /var/log/kolla/triliovault-datamover/triliovault-object-store.log # computes
    

    OSA:

    sudo lxc-attach `sudo lxc-ls -f | grep dmapi | awk '{print $1}'` -- journalctl --no-pager # controllers. Control tail and fork with --lines <nr_lines> and -f
    /openstack/log/*dmapi*/dmapi/dmapi.log # controllers
    /var/log/tvault-contego/tvault-contego.log # computes
    /var/log/tvault-object-store/tvault-object-store.log # computes
    
  • 1. Grab the restore ID:

    workloadmgr snapshot-list --workload_id a0e61dc0-14eb-4e43-8c4f-ae25ac4fa8c0
    workloadmgr restore-list --snapshot_id 546b542a-9fba-4484-848d-da730b4e46c6
    

    2. In the TVM logs, get the ID of the instance that was trying to spin up:

    sudo cat /var/log/workloadmgr/workloadmgr-workloads.log* | grep CopyBackupImageToVolume.execute.*ENTER | grep <restore_id> | awk '{print $9}' | cut -f1 -d"," | sort | uniq

    3. Grab the host from the OpenStack DB (inside the OpenStack controller):

    # Ansible
    sudo lxc-attach `sudo lxc-ls -f | grep galera | awk '{print $1}'` -- mysql -u root -p`sudo cat /etc/openstack_deploy/user_secrets.yml | grep ^galera_root_password | awk '{print $NF}'` -e "select uuid,display_name,node,launched_on,created_at,deleted_at from nova.instances where uuid='a2454040-d4ed-4711-8d5e-6b3637f69aa9'\G"
    
    # Kolla
    sudo docker exec -itu root `sudo docker ps | grep mariadb | awk '{print $NF}'` mysql -u root -p`sudo cat /etc/kolla/passwords.yml | grep ^database_password | awk '{print $NF}'` -e "select uuid,display_name,node,launched_on,created_at,deleted_at from nova.instances where uuid='a2454040-d4ed-4711-8d5e-6b3637f69aa9'\G"
    
    sudo lsmod | grep nbd # if not loaded, load it: sudo modprobe nbd max_part=16 # confirm the 'nbd' module is present: sudo ls /sys/module | grep -w nbd
    sudo qemu-nbd --connect=/dev/nbd0 /var/trilio/triliovault-mounts/L21udC90dmF1bHQva29sbGF0YXJnZXRuZnM=/workload_847dbe3f-6ff7-4ead-ab06-186713c3d53f/snapshot_91d7c684-7296-44cd-bec6-92eefc73f58b/vm_id_a2f706a3-d202-4300-8bcb-d667230a3091/vm_res_id_b465433c-1037-404a-81f7-264098795365_vda/1940843e-e6b9-4555-afc9-b8fd50e54226
    sudo fdisk /dev/nbd0 -l
    sudo mkdir /mnt/qcow_disk
    sudo mount /dev/nbd0p1 /mnt/qcow_disk
    sudo ls -lrt /mnt/qcow_disk
    sudo umount /mnt/qcow_disk
    sudo rmdir /mnt/qcow_disk
    sudo qemu-nbd --disconnect /dev/nbd0
    #sudo rmmod nbd # only if you want to unload the module
    
  • Run, from inside the TVM (or the WLM-API container):

    CONFIG_FILE=/etc/workloadmgr/workloadmgr.conf
    #CONFIG_FILE="/etc/triliovault-wlm/triliovault-wlm*.conf" # v5.x
    unset "${!OS_@}" && openstack versions show --os-username `sudo cat ${CONFIG_FILE} | grep -w ^admin_user | awk -F"=" '{print $NF}'` --os-password `sudo cat ${CONFIG_FILE} | grep -w ^admin_password | awk -F"=" '{print $NF}'` --os-project-name `sudo cat ${CONFIG_FILE} | grep -w ^admin_tenant_name | awk -F"=" '{print $NF}'` --os-project-domain-id `sudo cat ${CONFIG_FILE} | grep -w ^project_domain_id | awk -F"=" '{print $NF}'` --os-user-domain-id `sudo cat ${CONFIG_FILE} | grep -w ^user_domain_id | awk -F"=" '{print $NF}'` --os-auth-url `sudo cat ${CONFIG_FILE} | grep -w ^auth_url | awk -F"=" '{print $NF}'` --insecure --debug
    unset "${!OS_@}" && openstack versions show --os-username `sudo cat ${CONFIG_FILE} | grep -w ^username | awk -F"=" '{print $NF}'` --os-password `sudo cat ${CONFIG_FILE} | grep -w ^password | awk -F"=" '{print $NF}'` --os-project-name `sudo cat ${CONFIG_FILE} | grep -w ^project_name | awk -F"=" '{print $NF}'` --os-project-domain-id `sudo cat ${CONFIG_FILE} | grep -w ^project_domain_id | awk -F"=" '{print $NF}'` --os-user-domain-id `sudo cat ${CONFIG_FILE} | grep -w ^user_domain_id | awk -F"=" '{print $NF}'` --os-auth-url `sudo cat ${CONFIG_FILE} | grep -w ^auth_url | awk -F"=" '{print $NF}'` --insecure --debug
    

    If the OpenStack service versions are listed, credentials are OK.

  • Control Plane Services:

    wlm-cron (runs on one node)
    wlm-api (runs on all controller nodes)
    wlm-workloads (runs on all controller nodes)
    wlm-scheduler (runs on all controller nodes)
    trilio-datamover-api (runs on all controller nodes)

    Compute Plane Services:

    trilio-datamover (runs on all compute nodes)
    
  • juju status | grep trilio | grep charm
    juju ssh trilio-wlm/leader "sudo dpkg -l | grep -Ei 'trilio|tvault|workloadmgr|s3|dmapi|dm_api|datamover|data_mover|contego'"
    juju ssh trilio-dm-api/leader "sudo dpkg -l | grep -Ei 'trilio|tvault|workloadmgr|s3|dmapi|dm_api|datamover|data_mover|contego'"
    juju ssh trilio-data-mover/leader "sudo dpkg -l | grep -Ei 'trilio|tvault|workloadmgr|s3|dmapi|dm_api|datamover|data_mover|contego'"
    
  • sudo lxc-attach `sudo lxc-ls -f | grep dmapi | awk '{print $1}'` -- bash -c 'sudo dpkg -l | grep dmapi' # controllers
    sudo lxc-attach `sudo lxc-ls -f | grep horizon | awk '{print $1}'` -- bash -c 'sudo dpkg -l | grep -E "tvault|workloadmgr|contego"' # controllers
    sudo dpkg -l | grep -Ei 'trilio|tvault|workloadmgr|s3|dmapi|dm_api|datamover|data_mover|contego' # computes
    