Introduction
Red Hat Advanced Cluster Management for Kubernetes (RHACM) defines two main types of clusters: hub clusters and managed clusters.
The hub cluster is the main cluster with RHACM installed on it. You can create, manage, and monitor other Kubernetes clusters with the hub cluster. The managed clusters are Kubernetes clusters that are managed by the hub cluster. You can create some clusters by using the RHACM hub cluster, and you can also import existing clusters to be managed by the hub cluster. Since the hub cluster manages the cluster fleet, it is vital that there is a business continuity scenario built in so that when an unexpected event causes a hub cluster to fail, the cluster fleet can be managed by a new hub cluster.
The RHACM backup and restore feature, available starting with version 2.5, offers support for building a Disaster Recovery solution to recover the hub cluster when it fails. There is a shortcoming for this feature though: only managed clusters created using the Hive API are automatically connected to the restored hub cluster. Imported managed clusters must be manually reconnected on the new hub cluster.
RHACM 2.7 provides a solution to automatically import managed clusters when restoring on a new hub cluster.
The purpose of this blog is to provide a walk-through on how to enable and make use of the solution available with RHACM 2.7 to automatically import managed clusters on a restore hub cluster operation. Before showing how to use the auto import feature available with RHACM 2.7, let's see why this approach is needed in the first place.
Why imported clusters must be manually reimported after restore
When the backup data is moved to another hub cluster, only Hive managed clusters are automatically connected with the new hub cluster. Hive clusters are managed clusters created on the hub cluster using the Create cluster action available from the Clusters tab in the console.
Managed clusters connected with the initial hub cluster by using the Import cluster action appear as Pending Import when the hub cluster data is restored on a new hub cluster, and the clusters must be manually imported back on the new hub cluster.
Hive managed clusters are automatically connected with the new hub cluster because Hive stores the managed cluster kubeconfig in the managed cluster namespace on the hub cluster, and this kubeconfig is backed up and restored on the new hub cluster. The import controller updates the bootstrap kubeconfig on the managed cluster using this restored configuration. This information is only available for managed clusters created by using the Hive API and is not available for imported clusters.
The workaround provided with RHACM 2.5 and RHACM 2.6 for reconnecting imported clusters with the new hub cluster is to manually create the auto-import-secret after the restore operation is started. The auto-import-secret must be created on the restore hub cluster in the managed cluster namespace, for each cluster in the Pending Import state. This auto-import-secret must use a kubeconfig or token with enough permissions for the import component to start the auto import on the new hub cluster.
For a large number of imported managed clusters, this is a very tedious operation because it is run manually for each managed cluster. It increases the Recovery Time Objective and requires the user who runs the restore operation to have access to each managed cluster and to obtain a token that can be used to connect with it. This token must have a klusterlet role binding or a role with equivalent permissions.
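To make the manual workaround concrete, here is a minimal Python sketch that builds one auto-import-secret manifest per cluster in the Pending Import state. It only constructs dictionaries and does not talk to any cluster. The stringData keys (autoImportRetry, token, server) follow the layout described in the RHACM documentation on importing a cluster with the auto import secret, but treat them, together with the function names and sample values, as assumptions to verify against your RHACM version.

```python
from typing import Dict, List


def build_auto_import_secret(cluster_name: str, server: str, token: str,
                             retries: int = 5) -> Dict:
    """Return a Secret manifest for one cluster in Pending Import.

    The secret goes into the managed cluster namespace on the restore hub,
    which carries the same name as the managed cluster.
    """
    return {
        "apiVersion": "v1",
        "kind": "Secret",
        "metadata": {"name": "auto-import-secret", "namespace": cluster_name},
        "type": "Opaque",
        "stringData": {
            # Assumed key names; check them against the RHACM documentation.
            "autoImportRetry": str(retries),
            "token": token,
            "server": server,
        },
    }


def build_all(pending: List[Dict[str, str]]) -> List[Dict]:
    """Build one manifest per pending cluster (hypothetical input shape)."""
    return [
        build_auto_import_secret(c["name"], c["server"], c["token"])
        for c in pending
    ]


if __name__ == "__main__":
    # Placeholder values only; a real token must be obtained per cluster.
    pending = [
        {"name": "cluster-a", "server": "https://api.cluster-a.example.com:6443", "token": "<token-a>"},
        {"name": "cluster-b", "server": "https://api.cluster-b.example.com:6443", "token": "<token-b>"},
    ]
    for manifest in build_all(pending):
        print(manifest["metadata"]["namespace"], manifest["metadata"]["name"])
```

Even with a helper like this, someone still has to collect a server URL and a sufficiently privileged token for every imported cluster, which is the part that does not scale.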
Automatically reconnecting managed clusters with RHACM 2.7
Continue reading to learn about the new solution for automatically connecting imported clusters to the new hub cluster by using the ManagedServiceAccount feature, available with the backup and restore component in RHACM 2.7. The following sections show you how to enable this feature and explain possible limitations.
How the automatic connection works
The backup controller available with RHACM 2.7 uses the ManagedServiceAccount component on the primary hub cluster to create a token for each of the imported managed clusters.
This token is backed up in each managed cluster namespace and is set to use a klusterlet-bootstrap-kubeconfig ClusterRole binding, which allows the token to be used when importing the managed cluster with the auto import secret. The klusterlet-bootstrap-kubeconfig ClusterRole can only get or update the bootstrap-hub-kubeconfig secret, so there is limited access to the managed cluster.
When the activation data is restored on the new hub cluster, the restore controller runs a post restore operation and looks for all managed clusters in the Pending Import state. For these managed clusters, it checks if there is a valid token generated by the ManagedServiceAccount and, if found, creates an auto-import-secret by using this token. As a result, the cluster import component tries to reconnect the managed cluster, and if the cluster is accessible, the operation is successful.
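The post restore step described above can be sketched in a few lines of Python. This is an in-memory model of the decision logic only, not the restore controller's actual code: the Cluster and Token classes, the state strings, and the post_restore function are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Dict, List, Optional


@dataclass
class Token:
    value: str
    expires_at: datetime


@dataclass
class Cluster:
    name: str
    state: str                     # for example "Pending Import" or "Ready"
    token: Optional[Token] = None  # restored ManagedServiceAccount token, if any


def post_restore(clusters: List[Cluster], now: datetime) -> Dict[str, str]:
    """Return {cluster name: token} for each auto-import-secret to create."""
    secrets: Dict[str, str] = {}
    for cluster in clusters:
        if cluster.state != "Pending Import":
            continue  # already connected, nothing to do
        if cluster.token is None or cluster.token.expires_at <= now:
            continue  # no valid token, cluster stays in Pending Import
        secrets[cluster.name] = cluster.token.value
    return secrets


if __name__ == "__main__":
    now = datetime(2022, 10, 21, 13, 35, tzinfo=timezone.utc)
    clusters = [
        Cluster("hive-cls", "Ready"),
        Cluster("imported-ok", "Pending Import", Token("abc", now + timedelta(hours=100))),
        Cluster("imported-expired", "Pending Import", Token("old", now - timedelta(hours=1))),
        Cluster("imported-no-token", "Pending Import"),
    ]
    print(post_restore(clusters, now))
```

In this model only imported-ok gets an auto-import-secret; the other two imported clusters remain in Pending Import and would still need the manual workaround.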
Automatic import value
When the hub cluster backup data is restored on a new hub cluster, all managed clusters are automatically connected with the new hub cluster.
Prerequisites
See the following prerequisites to follow along in this blog.
For both active and passive hub clusters:
- RHACM version 2.7 or later must be installed on your hub cluster. See the following screen capture:

- Enable the ManagedServiceAccount component on MultiClusterEngine by editing the MultiClusterEngine resource and setting enabled: true for the managedserviceaccount-preview component. See the following example:

    apiVersion: multicluster.openshift.io/v1
    kind: MultiClusterEngine
    metadata:
      name: multiclusterhub
    spec:
      overrides:
        components:
        - enabled: true
          name: managedserviceaccount-preview

- Enable the cluster-backup Operator on the hub cluster. Edit the MultiClusterHub resource and set enabled: true for the cluster-backup component. This also installs the OADP operator in the open-cluster-management-backup namespace. See the following example:

    apiVersion: operator.open-cluster-management.io/v1
    kind: MultiClusterHub
    spec:
      overrides:
        components:
        - enabled: true
          name: cluster-backup

- You must create the DataProtectionApplication resource in the open-cluster-management-backup namespace and point to a valid storage location for backups.
Enabling the automatic import feature on active hub cluster
To enable the automatic import feature, set the useManagedServiceAccount property to true when creating the BackupSchedule.cluster.open-cluster-management.io resource on the active hub cluster. See the following example:
apiVersion: cluster.open-cluster-management.io/v1beta1
kind: BackupSchedule
metadata:
  name: schedule-acm-msa
spec:
  veleroSchedule: 0 */1 * * *
  veleroTtl: 240h
  useManagedServiceAccount: true
Once useManagedServiceAccount is set to true, the backup controller starts processing imported managed clusters and, for each of them:
- Creates a ManagedClusterAddon named managed-serviceaccount.

- Creates a ManagedServiceAccount resource named auto-import-account and sets the token validity as defined by the BackupSchedule.

- The ManagedServiceAccount resource is processed by the ManagedClusterAddon, which triggers the creation of a token with the same name on the managed cluster. This token is pushed back to the hub cluster under the managed cluster namespace.

  [Screen capture: Managed Service Account token on managed cluster]

  Note that the token is created only if the managed cluster is accessible. If the managed cluster is not accessible at the time the ManagedServiceAccount is created, the token is created at a later time when the managed cluster becomes available. This hub cluster secret gets backed up.

  [Screen capture: Managed Service Account token on hub cluster]

- For each of the ManagedServiceAccount resources, the backup controller creates a ManifestWork used to set up, on the managed cluster, a klusterlet-bootstrap-kubeconfig RoleBinding for the ManagedServiceAccount token. The klusterlet-bootstrap-kubeconfig ClusterRole can only get or update the bootstrap-hub-kubeconfig secret. This role is used in a post restore operation to auto import the managed cluster on the restored hub cluster.

  [Screen capture: Managed Service Account role binding on managed cluster]
Notes:

- You can disable the automatic import cluster feature at any time by setting the useManagedServiceAccount option to false on the BackupSchedule resource. Removing the property has the same result since the default value is false.

  - When you disable the automatic import cluster feature, the backup controller removes the resources it created: ManagedClusterAddon, ManagedServiceAccount, and ManifestWork. This in turn deletes the auto import token on both the hub cluster and the managed cluster:

      apiVersion: cluster.open-cluster-management.io/v1beta1
      kind: BackupSchedule
      metadata:
        name: schedule-acm-msa
      spec:
        veleroSchedule: 0 */1 * * *
        veleroTtl: 240h
        useManagedServiceAccount: false

- The ManagedServiceAccount auto-import-account token validity duration is automatically set to twice the value of veleroTtl, to maximize the chance that the token stays valid for the entire lifecycle of every backup that stores it. You can change this value if you want to control how long a token should be valid, but keep in mind that this could produce backups whose tokens expire during the lifecycle of the backup. Use the managedServiceAccountTTL property to change the token TTL:

    apiVersion: cluster.open-cluster-management.io/v1beta1
    kind: BackupSchedule
    metadata:
      name: schedule-acm-msa
    spec:
      veleroSchedule: 0 */2 * * *
      veleroTtl: 120h
      useManagedServiceAccount: true
      managedServiceAccountTTL: 2h
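The relationship between veleroTtl and the token validity can be checked with a little arithmetic. The following Python sketch assumes duration strings made of hours, minutes, and seconds (such as 240h or 1h30m); the function names are hypothetical, and the real controller may accept other formats.

```python
import re
from datetime import timedelta
from typing import Optional

_UNITS = {"h": "hours", "m": "minutes", "s": "seconds"}


def parse_duration(text: str) -> timedelta:
    """Parse strings such as '240h' or '1h30m' (h, m, and s units only)."""
    parts = re.findall(r"(\d+)([hms])", text)
    if not parts or "".join(n + u for n, u in parts) != text:
        raise ValueError(f"unsupported duration: {text!r}")
    total = timedelta()
    for number, unit in parts:
        total += timedelta(**{_UNITS[unit]: int(number)})
    return total


def token_ttl(velero_ttl: str, msa_ttl: Optional[str] = None) -> timedelta:
    """Token validity: the explicit override if given, else twice veleroTtl."""
    if msa_ttl is not None:
        return parse_duration(msa_ttl)
    return 2 * parse_duration(velero_ttl)


def shorter_than_backup_lifetime(velero_ttl: str, msa_ttl: Optional[str] = None) -> bool:
    """True when the token is valid for less time than a backup is kept."""
    return token_ttl(velero_ttl, msa_ttl) < parse_duration(velero_ttl)


if __name__ == "__main__":
    print(token_ttl("240h"))                           # default: 480 hours
    print(shorter_than_backup_lifetime("120h", "2h"))  # override from the example above
```

With the default, a 240h veleroTtl yields a token valid for 480 hours. With the override shown above (veleroTtl of 120h and managedServiceAccountTTL of 2h), the token is valid for far less time than the backups are kept, so older backups will contain expired tokens.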
Automatically reconnect imported clusters on restore hub cluster
The backup data is restored on the new hub cluster using a Restore resource, as shown in the following example:
apiVersion: cluster.open-cluster-management.io/v1beta1
kind: Restore
metadata:
  name: restore-acm
  namespace: open-cluster-management-backup
spec:
  cleanupBeforeRestore: CleanupRestored
  veleroManagedClustersBackupName: latest
  veleroCredentialsBackupName: latest
  veleroResourcesBackupName: latest
When the managed cluster backup data is restored on the new hub cluster, the restore controller runs a post restore operation and looks for all managed clusters in the Pending Import state.
For these managed clusters, it checks whether there is a valid auto-import-account token under the managed cluster namespace on the new hub cluster. If such a token is found, the post restore routine creates an auto-import-secret using this token.
As a result, the cluster import component tries to reconnect the managed cluster, and if the cluster is accessible, the operation is successful.
You should see the following status message for the Restore resource if the post restore operation has created an auto-import-secret, triggering the auto import operation for a managed cluster in the Pending Import state:
apiVersion: cluster.open-cluster-management.io/v1beta1
kind: Restore
metadata:
  name: restore-acm
  namespace: open-cluster-management-backup
spec:
  cleanupBeforeRestore: CleanupRestored
  veleroManagedClustersBackupName: latest
  veleroCredentialsBackupName: latest
  veleroResourcesBackupName: latest
status:
  lastMessage: Velero restores have run to completion
  messages:
  - Created auto-import-secret for managed cluster (vb-managed-cls-1)
  phase: Finished
  veleroCredentialsRestoreName: restore-acm-acm-credentials-schedule-20221021133542
  veleroManagedClustersRestoreName: restore-acm-acm-managed-clusters-schedule-20221021133542
  veleroResourcesRestoreName: restore-acm-acm-resources-schedule-20221021133542
Limitations with the automatic import feature
The approach described above has limitations. The following situations can result in a managed cluster not being automatically imported when moving to a new hub cluster:
- Because the automatic import operation uses the cluster import feature with the auto import secret, the hub cluster must be able to access the managed cluster and run the cluster import operation.

- Because the auto-import-secret created on restore uses the ManagedServiceAccount token to connect to the managed cluster, the managed cluster must also provide the kube apiserver information. The apiserver must be set on the ManagedCluster resource, as in the sample below. Only OCP clusters have this apiserver set up automatically when the cluster is imported on the hub cluster. For any other type of managed cluster, such as EKS clusters, the user must set this information manually. Otherwise the automatic import feature ignores these clusters and they stay in Pending Import when moved to the restore hub cluster:

    apiVersion: cluster.open-cluster-management.io/v1
    kind: ManagedCluster
    metadata:
      name: managed-cluster-name
    spec:
      hubAcceptsClient: true
      leaseDurationSeconds: 60
      managedClusterClientConfigs:
      - url: <apiserver>

- The backup controller regularly looks for imported managed clusters and creates the ManagedServiceAccount resource under the managed cluster namespace as soon as such a managed cluster is found. This should trigger a token creation on the managed cluster. If the managed cluster is not accessible at the time this operation is executed, for example because the managed cluster is hibernating or is down, the ManagedServiceAccount is unable to create the token. As a result, if a hub backup is run at this time, the backup does not contain a token to auto import the managed cluster.

- It is possible for a ManagedServiceAccount secret to not be included in a backup if the backup schedule runs before the backup label is set on the ManagedServiceAccount secret. ManagedServiceAccount secrets don't have the cluster.open-cluster-management.io/backup label set on creation. For this reason, the backup controller regularly looks for ManagedServiceAccount secrets under the managed cluster namespaces and adds the backup label if it is missing.

- If the auto-import-account secret token is valid and is backed up, but the restore operation is run at a time when the token available with the backup has already expired, the auto import operation fails. In this case, the restore.cluster.open-cluster-management.io resource status should report the invalid token issue for each managed cluster in this situation.
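As a planning aid, the limitations above can be turned into a simple pre-restore checklist. The Python sketch below is a hypothetical model, not part of RHACM: the ImportedCluster fields and the reason strings are assumptions, and in practice you would have to fill them from the ManagedCluster resources and the backed-up secrets yourself.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional


@dataclass
class ImportedCluster:
    name: str
    apiserver_url: Optional[str]           # from managedClusterClientConfigs, if set
    token_expires_at: Optional[datetime]   # None when the backup holds no token
    reachable: bool                        # can the hub reach the managed cluster?


def auto_import_blocker(cluster: ImportedCluster, restore_time: datetime) -> Optional[str]:
    """Return why auto import would not succeed, or None if nothing blocks it."""
    if cluster.token_expires_at is None:
        return "no token in backup"
    if cluster.token_expires_at <= restore_time:
        return "token expired before restore"
    if not cluster.apiserver_url:
        return "apiserver url not set on ManagedCluster"
    if not cluster.reachable:
        return "managed cluster not reachable from hub"
    return None


if __name__ == "__main__":
    restore_time = datetime(2022, 10, 21, 13, 35, tzinfo=timezone.utc)
    later = restore_time + timedelta(hours=48)
    fleet = [
        ImportedCluster("ocp-1", "https://api.ocp-1.example.com:6443", later, True),
        ImportedCluster("eks-1", None, later, True),
        ImportedCluster("ocp-down", "https://api.ocp-down.example.com:6443", None, False),
    ]
    for cluster in fleet:
        print(cluster.name, auto_import_blocker(cluster, restore_time) or "ok")
```

Clusters flagged by a check like this are the ones to prepare a manual auto-import-secret for before switching hubs.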
Conclusion
This blog describes how to use the cluster backup and restore operator available with RHACM 2.7 to automatically reconnect imported managed clusters to the new hub after a restore operation. It shows how to enable the automatic connect feature and how it works.