OpenShift on OpenStack: Using multiple Nova and Cinder availability zones

November 29, 2018, Eduardo Minguez

This post continues the “OpenShift on OpenStack: Availability Zones” series. The first part introduced OpenStack AZs and presented the different OpenShift deployment options with respect to AZs.

In this post we will explain the ‘best case scenario’, using OpenShift on OpenStack with multiple Nova AZs and multiple Cinder AZs where the AZ names match.

Part II - Scenario One (Recommended): Multiple Nova and Cinder AZs

The following scenario consists of:

3 Nova AZs (AZ1, AZ2, AZ3)
3 Cinder AZs (AZ1, AZ2, AZ3)

For demonstration, we will use the asb-etcd pod, as it is created at installation time. This pod needs a volume to store its data, so it creates a PVC. The purpose of this scenario is to show that the asb-etcd pod is scheduled in the Nova AZ that has the same name as the Cinder AZ where the volume backing its PVC was created. If no nodes are available within that AZ to mount the volume, the pod cannot start.
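The matching rule can be sketched in a few lines of Python. This is an illustrative model only, not the scheduler's actual code. The node names and their AZ assignments are hypothetical, and the label key `failure-domain.beta.kubernetes.io/zone` is assumed to be the zone label carried by both nodes and PVs in this OpenShift release.

```python
# Assumed zone label key on nodes and PVs (not confirmed by this article).
ZONE_LABEL = "failure-domain.beta.kubernetes.io/zone"


def nodes_matching_volume_zone(nodes, pv_labels):
    """Return the sorted names of nodes whose zone label equals the PV's zone label."""
    pv_zone = pv_labels.get(ZONE_LABEL)
    if pv_zone is None:
        # In this model, a volume without a zone label does not restrict placement.
        return sorted(nodes)
    return sorted(
        name for name, labels in nodes.items()
        if labels.get(ZONE_LABEL) == pv_zone
    )


# Hypothetical cluster: two nodes per Nova AZ.
nodes = {
    "node-a": {ZONE_LABEL: "AZ1"},
    "node-b": {ZONE_LABEL: "AZ2"},
    "node-c": {ZONE_LABEL: "AZ3"},
    "node-d": {ZONE_LABEL: "AZ1"},
    "node-e": {ZONE_LABEL: "AZ2"},
    "node-f": {ZONE_LABEL: "AZ3"},
}

# Hypothetical PV whose Cinder volume was created in AZ2.
pv_labels = {ZONE_LABEL: "AZ2"}

print(nodes_matching_volume_zone(nodes, pv_labels))  # ['node-b', 'node-e']
```

Only the nodes in the AZ whose name matches the volume's AZ remain candidates. If that list is empty, the pod has nowhere to run.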

We begin by investigating the StorageClass created at installation time. It includes information about the provisioner (Cinder) and some other parameters. It looks like this:

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  ...
parameters:
  fstype: xfs
provisioner: kubernetes.io/cinder

After a successful installation, the pod has been created in AZ2 and is running on one of the AZ2 nodes (cicd-node-1):

asb-etcd-1-t8chg   1/1   Running   0       8m     10.130.2.3   cicd-node-1.cicd.com

If I delete the pod, its replacement is scheduled to the same node (notice the different pod name):

asb-etcd-1-7ljpf   1/1   Running   0       19s    10.130.2.4   cicd-node-1.cicd.com

If I drain and cordon that node (oc adm drain cicd-node-1.cicd.com), the pod is eventually rescheduled to another available node in AZ2 (in this case, cicd-infra-1.cicd.com):

asb-etcd-1-lp457   1/1   Running   0       2m     10.128.4.6   cicd-infra-1.cicd.com

If I drain and cordon all the remaining nodes in AZ2 (so that no node in AZ2 is schedulable), the pod cannot be scheduled:

4s   5s   2   asb-etcd-1-s4cqc   Pod   Warning   FailedScheduling   default-scheduler   0/4 nodes are available: 4 NoVolumeZoneConflict.

With no nodes available in AZ2, the pod stays in a Pending state. This is expected, because the asb-etcd pod can only be scheduled within AZ2, where its volume lives.

asb-etcd-1-s4cqc   0/1  Pending   0     41s
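The walkthrough above can be replayed with a toy model in Python. This is a sketch under stated assumptions, not the real scheduler: the nodes outside AZ2 and the overall cluster layout are hypothetical, the model assumes that cordoned nodes are left out of the node count, and it picks the first fitting node instead of scoring candidates.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    zone: str
    schedulable: bool = True  # False once the node has been cordoned


def schedule(nodes, volume_zone):
    """Return (node_name, None) if a node fits, else (None, reason)."""
    # Assumption: cordoned nodes are not considered (or counted) at all.
    candidates = [n for n in nodes if n.schedulable]
    fits = [n for n in candidates if n.zone == volume_zone]
    if fits:
        return fits[0].name, None  # simplification: first fit, no scoring
    total = len(candidates)
    # Message shaped like the event shown above, for illustration only.
    return None, f"0/{total} nodes are available: {total} NoVolumeZoneConflict."


def cordon(nodes, name):
    """Mark the named node as unschedulable."""
    for node in nodes:
        if node.name == name:
            node.schedulable = False


# Hypothetical layout: two nodes per AZ; only the AZ2 names come from the article.
cluster = [
    Node("cicd-node-0.cicd.com", "AZ1"),
    Node("cicd-infra-0.cicd.com", "AZ1"),
    Node("cicd-node-1.cicd.com", "AZ2"),
    Node("cicd-infra-1.cicd.com", "AZ2"),
    Node("cicd-node-2.cicd.com", "AZ3"),
    Node("cicd-infra-2.cicd.com", "AZ3"),
]

print(schedule(cluster, "AZ2"))  # ('cicd-node-1.cicd.com', None)
cordon(cluster, "cicd-node-1.cicd.com")
print(schedule(cluster, "AZ2"))  # ('cicd-infra-1.cicd.com', None)
cordon(cluster, "cicd-infra-1.cicd.com")
print(schedule(cluster, "AZ2"))  # (None, '0/4 nodes are available: 4 NoVolumeZoneConflict.')
```

Once every AZ2 node is cordoned, the four remaining nodes all fail the zone check, which is the situation that leaves the pod Pending.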

Scenario One Conclusion

Pods are scheduled to nodes whose Nova AZ name matches the Cinder AZ name of the PV they use. If there is no available node in that AZ to mount the volume, the pod won’t be able to start.

But wait… What about when you have multiple Nova availability zones and just one Cinder availability zone? How do you handle that scenario? The next blog in the series, "multiple Nova AZs with a single Cinder AZ", answers these questions. It will explain why scenario one (multiple Nova AZs and multiple Cinder AZs) is recommended, and also how to handle environments where multiple Cinder AZs are not available.