A simple guide to replacing a failed Control Plane node

May 11, 2023 | Kibria Ghulam, Marcklyvens Morency | 22-minute read

Purpose

The purpose of this procedure is to show a simple way to replace a failed control plane node in a bare metal OpenShift cluster (3+0 or 3+N). The method is based on IPI and lets you replace a failed control plane node quickly.

Prerequisites

An existing bare metal cluster installed with IPI, ABI, or the Assisted Installer, running OCP >= 4.12.1.
Be sure that all the required DNS records exist.
You have access to the cluster as a user with the cluster-admin role.
You have taken an etcd backup.
Bare metal Operator is available ($ oc get clusteroperator baremetal).
Server boot mode is set to UEFI and Redfish virtual media is supported.

Replacing a Master node

Here we will replace master-2 with master-x. To simulate the node failure, we will shut down master-2.

Pre-check validation

$ oc get nodes
NAME       STATUS     ROLES                         AGE   VERSION
master-0   Ready      control-plane,master,worker   31m   v1.25.8+27e744f
master-1   Ready      control-plane,master,worker   58m   v1.25.8+27e744f
master-2   NotReady   control-plane,master,worker   58m   v1.25.8+27e744f
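
In a larger cluster, the failed node can be picked out of the listing mechanically. The following is a minimal sketch that assumes the default column layout shown above; the function name not_ready_nodes is hypothetical and not part of oc.

```shell
# Print the names of nodes whose STATUS column is not exactly "Ready".
# Reads `oc get nodes --no-headers` output on stdin.
# Note: a cordoned node ("Ready,SchedulingDisabled") is also printed.
not_ready_nodes() {
  awk '$2 != "Ready" { print $1 }'
}

# Usage (against a live cluster):
#   oc get nodes --no-headers | not_ready_nodes
```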

Control Node replacement

For details on how to remove an unhealthy etcd member, see the official documentation: Remove-Unhealth-ETCD

Remove Unhealthy ETCD Master-2 Member

Checking ETCD Member Status

$ oc -n openshift-etcd rsh etcd-master-0 etcdctl member list -w table
+------------------+---------+----------+----------------------------+----------------------------+------------+
|        ID        | STATUS  |   NAME   |         PEER ADDRS         |        CLIENT ADDRS        | IS LEARNER |
+------------------+---------+----------+----------------------------+----------------------------+------------+
|  c300d358075445b | started | master-0 | https://192.168.24.87:2380 | https://192.168.24.87:2379 |      false |
| 1a7b6f4c3aac9be1 | started | master-1 | https://192.168.24.88:2380 | https://192.168.24.88:2379 |      false |
| 6fd2f8909c811461 | started | master-2 | https://192.168.24.86:2380 | https://192.168.24.86:2379 |      false |
+------------------+---------+----------+----------------------------+----------------------------+------------+
$ oc -n openshift-etcd rsh etcd-master-0 etcdctl endpoint health
{"level":"warn","ts":"2023-04-24T17:11:59.984Z","logger":"client","caller":"v3@v3.5.6/retry_interceptor.go:62","msg":"retrying of unary invoker failed","target":"etcd-endpoints://0xc00028c000/192.168.24.86:2379","attempt":0,"error":"rpc error: code = DeadlineExceeded desc = context deadline exceeded"}
https://192.168.24.88:2379 is healthy: successfully committed proposal: took = 7.12757ms
https://192.168.24.87:2379 is healthy: successfully committed proposal: took = 7.216856ms
https://192.168.24.86:2379 is unhealthy: failed to commit proposal: context deadline exceeded
Error: unhealthy cluster
command terminated with exit code 1

Note: Record the master-2 member ID; it is needed in the next steps.
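
Instead of copying the ID by hand, it can be extracted from the default comma-separated output of etcdctl member list (the form without -w table). This is a minimal sketch; the helper name etcd_member_id is hypothetical.

```shell
# Print the ID of the etcd member with the given name.
# Reads the default (comma-separated) `etcdctl member list` output on stdin.
# Fields are: ID, STATUS, NAME, PEER ADDRS, CLIENT ADDRS, IS LEARNER.
etcd_member_id() {
  awk -F', ' -v name="$1" '$3 == name { print $1 }'
}

# Usage (against a live cluster):
#   oc -n openshift-etcd rsh etcd-master-0 etcdctl member list | etcd_member_id master-2
```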

Remove ETCD master-2 Member-ID

$ oc -n openshift-etcd rsh etcd-master-0 
sh-4.4# etcdctl member list
c300d358075445b, started, master-0, https://192.168.24.87:2380, https://192.168.24.87:2379, false
1a7b6f4c3aac9be1, started, master-1, https://192.168.24.88:2380, https://192.168.24.88:2379, false
6fd2f8909c811461, started, master-2, https://192.168.24.86:2380, https://192.168.24.86:2379, false
sh-4.4# etcdctl member remove 6fd2f8909c811461
Member 6fd2f8909c811461 removed from cluster c413f45f7dfe9590

Check ETCD Member Status Again
Note: Make sure the master-2 ETCD member is no longer shown in the member list.

sh-4.4# etcdctl member list -w table
+------------------+---------+----------+----------------------------+----------------------------+------------+
|        ID        | STATUS  |   NAME   |         PEER ADDRS         |        CLIENT ADDRS        | IS LEARNER |
+------------------+---------+----------+----------------------------+----------------------------+------------+
|  c300d358075445b | started | master-0 | https://192.168.24.87:2380 | https://192.168.24.87:2379 |      false |
| 1a7b6f4c3aac9be1 | started | master-1 | https://192.168.24.88:2380 | https://192.168.24.88:2379 |      false |
+------------------+---------+----------+----------------------------+----------------------------+------------+

List The Old Secrets for Unhealthy Master-2

$ oc get secret -n openshift-etcd | grep master-2
etcd-peer-master-2              kubernetes.io/tls                     2      56m
etcd-serving-master-2           kubernetes.io/tls                     2      56m
etcd-serving-metrics-master-2   kubernetes.io/tls                     2      56m

Remove the old secrets for the unhealthy etcd member that was removed

$ oc get secrets -n openshift-etcd | grep master-2 | awk '{print $1}' | xargs oc -n openshift-etcd delete secrets
secret "etcd-peer-master-2" deleted
secret "etcd-serving-master-2" deleted
secret "etcd-serving-metrics-master-2" deleted
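
Because grep master-2 matches any secret whose name contains that string, some readers may prefer to delete the three secrets by their exact names. A minimal sketch, assuming the secret naming pattern shown in the listing above; the helper name etcd_secret_names is hypothetical.

```shell
# Print the three etcd secret names for a node, one per line,
# following the naming pattern seen in the listing above.
etcd_secret_names() {
  node="$1"
  for prefix in etcd-peer etcd-serving etcd-serving-metrics; do
    printf '%s-%s\n' "$prefix" "$node"
  done
}

# Usage (against a live cluster):
#   etcd_secret_names master-2 | xargs oc -n openshift-etcd delete secret
```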

Check ETCD Status

$ oc get pods -n openshift-etcd | grep -v etcd-quorum-guard | grep etcd
etcd-master-0               5/5     Running     0          54m
etcd-master-2               2/5     NotReady    0          50m
etcd-master-1               5/5     Running     0          52m

Force the etcd redeployment

$ oc patch etcd cluster -p='{"spec": {"forceRedeploymentReason": "single-master-recovery-'"$( date --rfc-3339=ns )"'"}}' --type=merge
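
The quoting in the patch above is easy to get wrong, because the JSON sits in single quotes while the timestamp needs shell expansion. The timestamp is there only to make the reason string unique, so the operator sees a change. As a sketch, the patch body can be built separately and inspected before use; the helper name redeploy_patch is hypothetical, and date --rfc-3339=ns assumes GNU date.

```shell
# Build the merge-patch body with a unique reason string.
# Requires GNU date for --rfc-3339=ns.
redeploy_patch() {
  printf '{"spec": {"forceRedeploymentReason": "single-master-recovery-%s"}}\n' \
    "$(date --rfc-3339=ns)"
}

# Usage (against a live cluster):
#   redeploy_patch                                   # inspect the JSON first
#   oc patch etcd cluster --type=merge -p "$(redeploy_patch)"
```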

Delete Machine and BMH of the failed Master

$ oc delete machine master-2 -n openshift-machine-api
$ oc delete bmh master-2 -n openshift-machine-api

Prepare to Delete master-2 Node

Check PODs status on Master-2 Node

$ oc get po -A -o wide | grep master-2

There should still be pods allocated to master-2. Follow the next steps to clean them up.

Delete Master-2 Node

$ oc delete node master-2

Note: Check the pod status again to make sure no pods are still allocated to or running on master-2.

$ oc get po -o wide -A|grep master-2|wc -l
0
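
As an alternative to grep, the API server can filter pods by node name with a field selector, which avoids matching the string master-2 elsewhere in the line. A minimal sketch; the helper name pods_on_node is hypothetical.

```shell
# Count pods scheduled on the given node, across all namespaces.
# Uses the spec.nodeName field selector; the "No resources found"
# message goes to stderr and is discarded.
pods_on_node() {
  oc get pods -A --no-headers --field-selector "spec.nodeName=$1" 2>/dev/null \
    | wc -l | tr -d ' '
}

# Usage (against a live cluster):
#   pods_on_node master-2
```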

$ oc get nodes
NAME       STATUS   ROLES                         AGE   VERSION
master-0   Ready    control-plane,master,worker   47m   v1.25.8+27e744f
master-1   Ready    control-plane,master,worker   75m   v1.25.8+27e744f

Now we are ready to add the new control plane node.

Preparing the bare metal node

To replace the failed master node, you can use either a static or a dynamic IP configuration. When replacing a master node using a DHCP server, the node must have a DHCP reservation.

1- Make sure the new master node is powered off.

2- Validate that your oc client version matches the cluster version.

3- Retrieve the user name and password of the bare metal node’s baseboard management controller. Then, create base64 strings from the user name and password:

$ echo -ne "root" | base64
$ echo -ne "password" | base64
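
A stray trailing newline in the encoded value is a common cause of BMC login failures, and the behaviour of echo -n differs between shells. As a sketch, printf '%s' encodes exactly the characters given; the helper name b64 is hypothetical, and "root" is only the sample user name from the step above.

```shell
# Base64-encode a string without adding a trailing newline to the input.
b64() {
  printf '%s' "$1" | base64
}

# Usage:
#   b64 root        # prints cm9vdA==
#   b64 password    # prints cGFzc3dvcmQ=
```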

Create a configuration file for the bare metal node

$  cat <<EOF | oc apply -f -
apiVersion: v1
kind: Secret
metadata:
  name: control-plane-3-bmc-secret 
  namespace: openshift-machine-api
data:
  username: cm9fdd=
  password: Y2Fsd==
type: Opaque
---
apiVersion: metal3.io/v1alpha1
kind: BareMetalHost
metadata:
  name: master-x
  namespace: openshift-machine-api
spec:
  automatedCleaningMode: disabled
  bmc:
    address: idrac-virtualmedia://192.168.24.159/redfish/v1/Systems/System.Embedded.1    # Dell iDRAC format; for HP or other vendors, check the virtual media path
    credentialsName: control-plane-3-bmc-secret 
    disableCertificateVerification: True
  bootMACAddress: b8:ce:f6:56:a9:ea
  bootMode: UEFI
  externallyProvisioned: false
  hardwareProfile: unknown
  online: true
  rootDeviceHints:
    deviceName: /dev/sdb
EOF

Once the BareMetalHost has been created, you need to create the Machine for the new master node.

$  cat <<EOF | oc apply -f -
apiVersion: machine.openshift.io/v1beta1
kind: Machine
metadata:
  annotations:
    metal3.io/BareMetalHost: openshift-machine-api/master-x 
  labels:
    machine.openshift.io/cluster-api-cluster: abi-4c7mt
    machine.openshift.io/cluster-api-machine-role: master
    machine.openshift.io/cluster-api-machine-type: master
  name: abi-4c7mt-master-x
  namespace: openshift-machine-api
spec:
  metadata: {}
  providerSpec:
    value:
      apiVersion: baremetal.cluster.k8s.io/v1alpha1
      customDeploy:
        method: install_coreos
      hostSelector: {}
      image:
        checksum: ""
        url: ""
      kind: BareMetalMachineProviderSpec
      metadata:
        creationTimestamp: null
      userData:
        name: master-user-data-managed
EOF
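
The abi-4c7mt value in the cluster-api-cluster label and in the Machine name is specific to the cluster used in this article; it appears to be the cluster's infrastructure ID, so it has to be replaced with the value from your own cluster. A minimal sketch for looking it up; the helper name infra_id is hypothetical.

```shell
# Print the infrastructure ID of the current cluster.
infra_id() {
  oc get infrastructure cluster -o jsonpath='{.status.infrastructureName}'
}

# Usage (against a live cluster):
#   echo "$(infra_id)-master-x"    # candidate Machine name
```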

The BMH object is created and transitions through the following states:

Inspecting
Available
Provisioning
Provisioned

$ oc get bmh 
NAME       STATE        CONSUMER             ONLINE   ERROR   AGE
master-0   unmanaged    abi-4c7mt-master-0   true             93m
master-1   unmanaged    abi-4c7mt-master-1   true             93m
master-x   inspecting   abi-4c7mt-master-x   true             2m27s


Keep monitoring the BMH until its state changes to “available”. In the meantime, the server is booted from virtual media to install RHCOS.

$ oc get bmh master-x
NAME       STATE        CONSUMER             ONLINE   ERROR   AGE
master-x   inspecting   abi-4c7mt-master-x   true             3m13s

$ oc get machine
NAME                 PHASE          TYPE   REGION   ZONE   AGE
abi-4c7mt-master-0   Running                               99m
abi-4c7mt-master-1   Running                               99m
abi-4c7mt-master-x   Provisioning                          8m22s

$ oc get bmh master-x
NAME       STATE          CONSUMER             ONLINE   ERROR   AGE
master-x   provisioning   abi-4c7mt-master-x   true             11m

The node will be rebooted. Keep monitoring the BMH until its state changes to “provisioned”.

$ oc get bmh master-x
NAME       STATE         CONSUMER             ONLINE   ERROR   AGE
master-x   provisioned   abi-4c7mt-master-x   true             18m
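
Rather than re-running oc get bmh by hand, the state can be polled in a loop. The following is a sketch under the assumption that the STATE column corresponds to the .status.provisioning.state field of the BareMetalHost; the function name wait_for_bmh_state and its default retry values are hypothetical.

```shell
# Poll a BareMetalHost until it reaches the wanted state.
# Arguments: name, wanted state, max tries (default 60), delay in seconds (default 30).
# Returns 0 when the state matches, 1 when the tries are exhausted.
wait_for_bmh_state() {
  name="$1"; want="$2"; tries="${3:-60}"; delay="${4:-30}"
  while [ "$tries" -gt 0 ]; do
    state=$(oc get bmh "$name" -n openshift-machine-api \
      -o jsonpath='{.status.provisioning.state}' 2>/dev/null)
    echo "bmh/$name state: ${state:-unknown}"
    [ "$state" = "$want" ] && return 0
    tries=$((tries - 1))
    sleep "$delay"
  done
  return 1
}

# Usage (against a live cluster):
#   wait_for_bmh_state master-x provisioned
```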

$ oc get machine
NAME                 PHASE         TYPE   REGION   ZONE   AGE
abi-4c7mt-master-0   Running                              110m
abi-4c7mt-master-1   Running                              110m
abi-4c7mt-master-x   Provisioned                          18m

After two reboots, the new master node (master-x here) should join the cluster automatically (no CSR needs to be approved).

$ oc get bmh master-x
NAME       STATE         CONSUMER             ONLINE   ERROR   AGE
master-x   provisioned   abi-4c7mt-master-x   true             32m
$ oc get machine
NAME                 PHASE     TYPE   REGION   ZONE   AGE
abi-4c7mt-master-0   Running                          123m
abi-4c7mt-master-1   Running                          123m
abi-4c7mt-master-x   Running                          32m

Validation

Validate that all nodes are ready and that the cluster is stable.

$ oc get nodes
NAME       STATUS   ROLES                         AGE     VERSION
master-0   Ready    control-plane,master,worker   87m     v1.25.8+27e744f
master-1   Ready    control-plane,master,worker   114m    v1.25.8+27e744f
master-x   Ready    control-plane,master,worker   2m19s   v1.25.8+27e744f


$ oc get machine
NAME                 PHASE     TYPE   REGION   ZONE   AGE
abi-4c7mt-master-0   Running                          124m
abi-4c7mt-master-1   Running                          124m
abi-4c7mt-master-x   Running                          33m

$ oc get bmh 
NAME       STATE         CONSUMER             ONLINE   ERROR   AGE
master-0   unmanaged     abi-4c7mt-master-0   true             125m
master-1   unmanaged     abi-4c7mt-master-1   true             125m
master-x   provisioned   abi-4c7mt-master-x   true             33m



$ oc get co
NAME                                       VERSION   AVAILABLE   PROGRESSING   DEGRADED   SINCE   MESSAGE
authentication                             4.12.14   True        False         False      94m     
baremetal                                  4.12.14   True        False         False      117m    
cloud-controller-manager                   4.12.14   True        False         False      126m    
cloud-credential                           4.12.14   True        False         False      137m    
cluster-autoscaler                         4.12.14   True        False         False      117m    
config-operator                            4.12.14   True        False         False      118m    
console                                    4.12.14   True        False         False      97m     
control-plane-machine-set                  4.12.14   True        False         False      116m    
csi-snapshot-controller                    4.12.14   True        False         False      117m    
dns                                        4.12.14   True        False         False      115m    
etcd                                       4.12.14   True        False         False      116m    
image-registry                             4.12.14   True        False         False      106m    
ingress                                    4.12.14   True        False         False      114m    
insights                                   4.12.14   True        False         False      103m    
kube-apiserver                             4.12.14   True        False         False      97m     
kube-controller-manager                    4.12.14   True        False         False      115m    
kube-scheduler                             4.12.14   True        False         False      114m    
kube-storage-version-migrator              4.12.14   True        False         False      117m    
machine-api                                4.12.14   True        False         False      114m    
machine-approver                           4.12.14   True        False         False      117m    
machine-config                             4.12.14   True        False         False      53m     
marketplace                                4.12.14   True        False         False      117m    
monitoring                                 4.12.14   True        False         False      106m    
network                                    4.12.14   True        False         False      117m    
node-tuning                                4.12.14   True        False         False      116m    
openshift-apiserver                        4.12.14   True        False         False      112m    
openshift-controller-manager               4.12.14   True        False         False      113m    
openshift-samples                          4.12.14   True        False         False      109m    
operator-lifecycle-manager                 4.12.14   True        False         False      116m    
operator-lifecycle-manager-catalog         4.12.14   True        False         False      116m    
operator-lifecycle-manager-packageserver   4.12.14   True        False         False      112m    
service-ca                                 4.12.14   True        False         False      118m    
storage                                    4.12.14   True        False         False      118m    
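
With more than thirty operators, scanning the table by eye is error-prone. As a sketch, the listing can be filtered so that only operators deviating from AVAILABLE=True, PROGRESSING=False, DEGRADED=False are printed; the function name unhealthy_operators is hypothetical.

```shell
# Print cluster operators that are not Available, or are Progressing or Degraded.
# Reads `oc get co --no-headers` output on stdin.
# Columns: NAME VERSION AVAILABLE PROGRESSING DEGRADED SINCE MESSAGE
unhealthy_operators() {
  awk '$3 != "True" || $4 != "False" || $5 != "False" { print $1 }'
}

# Usage (against a live cluster); no output means all operators are settled:
#   oc get co --no-headers | unhealthy_operators
```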

$ oc get clusterversions.config.openshift.io 
NAME      VERSION   AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.12.14   True        False         92m     Cluster version is 4.12.14

Checking ETCD Member Status

$ oc get pods -n openshift-etcd
NAME                          READY   STATUS      RESTARTS   AGE
etcd-guard-master-0           1/1     Running     0          95m
etcd-guard-master-1           1/1     Running     0          114m
etcd-guard-master-x           1/1     Running     0          10m
etcd-master-0                 4/4     Running     0          7m15s
etcd-master-1                 4/4     Running     0          9m7s
etcd-master-x                 4/4     Running     0          5m11s
installer-10-master-x         0/1     Completed   0          13m
installer-11-master-0         0/1     Completed   0          8m22s
installer-11-master-1         0/1     Completed   0          10m
installer-11-master-x         0/1     Completed   0          6m23s
installer-7-master-0          0/1     Completed   0          98m
installer-9-master-0          0/1     Completed   0          91m
installer-9-master-1          0/1     Completed   0          92m
revision-pruner-10-master-0   0/1     Completed   0          13m
revision-pruner-10-master-1   0/1     Completed   0          13m
revision-pruner-10-master-x   0/1     Completed   0          13m
revision-pruner-11-master-0   0/1     Completed   0          10m
revision-pruner-11-master-1   0/1     Completed   0          10m
revision-pruner-11-master-x   0/1     Completed   0          10m
revision-pruner-7-master-0    0/1     Completed   0          98m
revision-pruner-7-master-1    0/1     Completed   0          98m
revision-pruner-8-master-0    0/1     Completed   0          95m
revision-pruner-8-master-1    0/1     Completed   0          95m
revision-pruner-9-master-0    0/1     Completed   0          94m
revision-pruner-9-master-1    0/1     Completed   0          94m
revision-pruner-9-master-x    0/1     Completed   0          14m


$ oc -n openshift-etcd rsh etcd-master-0 etcdctl member list -w table
+------------------+---------+----------+----------------------------+----------------------------+------------+
|        ID        | STATUS  |   NAME   |         PEER ADDRS         |        CLIENT ADDRS        | IS LEARNER |
+------------------+---------+----------+----------------------------+----------------------------+------------+
|  c300d358075445b | started | master-0 | https://192.168.24.87:2380 | https://192.168.24.87:2379 |      false |
| 1a7b6f4c3aac9be1 | started | master-1 | https://192.168.24.88:2380 | https://192.168.24.88:2379 |      false |
| b24ac36103e976e3 | started | master-x | https://192.168.24.91:2380 | https://192.168.24.91:2379 |      false |
+------------------+---------+----------+----------------------------+----------------------------+------------+
$ oc -n openshift-etcd rsh etcd-master-0 etcdctl endpoint health
https://192.168.24.87:2379 is healthy: successfully committed proposal: took = 8.642798ms
https://192.168.24.91:2379 is healthy: successfully committed proposal: took = 8.640762ms
https://192.168.24.88:2379 is healthy: successfully committed proposal: took = 8.938068ms
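
For a scripted check, the number of healthy endpoints can be counted and compared with the expected member count (three in this cluster). A minimal sketch based on the output format shown above; the helper name healthy_endpoint_count is hypothetical.

```shell
# Count endpoints reported as healthy.
# Reads `etcdctl endpoint health` output on stdin.
healthy_endpoint_count() {
  # grep -c exits non-zero when the count is 0, so mask the exit status.
  grep -c ' is healthy: ' || true
}

# Usage (against a live cluster):
#   oc -n openshift-etcd rsh etcd-master-0 etcdctl endpoint health 2>&1 | healthy_endpoint_count
```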

About the authors

Kibria Ghulam

Marcklyvens Morency
