Red Hat blog
By Annette Clewett and Husnain Bustam
Hopefully by now you've seen that with the release of Red Hat OpenShift Container Platform 3.10 we've rebranded our container-native storage (CNS) offering to be called Red Hat OpenShift Container Storage (OCS). Versioning remains sequential (i.e., OCS 3.10 is the follow-on to CNS 3.9).
OCS 3.10 introduces important features for container-based storage with OpenShift. Arbiter volume support allows for there to be only two replica copies of the data, while still providing split-brain protection and ~30% savings in storage infrastructure versus a replica-3 volume. This release also hardens block support for backing OpenShift infrastructure services. In addition to supporting arbiter volumes, major improvements to ease operations are available to give you the ability to monitor provisioned storage consumption, expand persistent volume (PV) capacity without downtime to the application, and use a more intuitive naming convention for PVs.
For easy evaluation of these features, an OpenShift Container Platform evaluation subscription now includes access to OCS evaluation binaries and subscriptions.
Now let’s dive deeper into the new features of the OCS 3.10 release:
- Prometheus OCS volume metrics: Volume consumption metrics (e.g., volume capacity, available space, number of inodes in use, number of inodes free) are now available in Prometheus for OCS. These metrics let you monitor storage capacity and consumption trends and take timely action so that applications are not impacted.
- Heketi topology and configuration metrics: Available from the Heketi HTTP metrics service endpoint, these metrics can be viewed using Prometheus or curl http://<heketi_service_route>/metrics. They can be used to query Heketi health, number of nodes, number of devices, device usage, and cluster count.
- Online expansion of provisioned storage: You can now expand OCS-backed PVs within OpenShift by editing the corresponding claim (oc edit pvc <claim_name>) and setting the new desired capacity (spec → resources → requests → storage: new value).
- Custom volume naming: Before this release, the names of dynamically provisioned GlusterFS volumes were auto-generated with a random UUID. Now, by adding a custom volume name prefix, the GlusterFS volume name will include the namespace or project as well as the claim name, making it much easier to map a volume to a particular workload.
- Arbiter volumes: Arbiter volumes allow for reduced storage consumption and better performance across the cluster while still providing the redundancy and reliability expected of GlusterFS.
Volume and Heketi metrics
As of OCP 3.10 and OCS 3.10, the following metrics are available in Prometheus. The heketi_* metrics can also be retrieved by executing curl http://<heketi_service_route>/metrics:
| Metric | Description |
|---|---|
| kubelet_volume_stats_available_bytes | Number of available bytes in the volume |
| kubelet_volume_stats_capacity_bytes | Capacity in bytes of the volume |
| kubelet_volume_stats_inodes | Maximum number of inodes in the volume |
| kubelet_volume_stats_inodes_free | Number of free inodes in the volume |
| kubelet_volume_stats_inodes_used | Number of used inodes in the volume |
| kubelet_volume_stats_used_bytes | Number of used bytes in the volume |
| heketi_cluster_count | Number of clusters |
| heketi_device_brick_count | Number of bricks on device |
| heketi_device_count | Number of devices on host |
| heketi_device_free | Amount of free space available on the device |
| heketi_device_size | Total size of the device |
| heketi_device_used | Amount of space used on the device |
| heketi_nodes_count | Number of nodes on the cluster |
| heketi_up | Verifies if heketi is running |
| heketi_volumes_count | Number of volumes on cluster |
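As an illustration of how the Heketi metrics listed above might be consumed outside Prometheus, here is a minimal Python sketch that parses text in the Prometheus exposition format and derives a per-device usage percentage from heketi_device_used and heketi_device_size. The sample text, including its label names and values, is hypothetical and was not captured from a real Heketi endpoint; the helper names are likewise invented for this example.

```python
import re
from collections import defaultdict

# Hypothetical sample in Prometheus text format. Label names and values are
# illustrative assumptions, not output from a real Heketi service.
SAMPLE = """\
# HELP heketi_device_size Total size of the device
heketi_device_size{device="/dev/vdb",hostname="node1"} 100
heketi_device_used{device="/dev/vdb",hostname="node1"} 25
heketi_device_size{device="/dev/vdb",hostname="node2"} 200
heketi_device_used{device="/dev/vdb",hostname="node2"} 150
heketi_up 1
"""

# metric_name{optional="labels"} value
LINE = re.compile(
    r'^(?P<name>[A-Za-z_:][A-Za-z0-9_:]*)(?:\{(?P<labels>.*)\})?\s+(?P<value>\S+)$'
)
LABEL = re.compile(r'(\w+)="([^"]*)"')


def parse_metrics(text):
    """Return {metric_name: {sorted_label_tuple: float_value}} for simple samples."""
    metrics = defaultdict(dict)
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        match = LINE.match(line)
        if match is None:
            continue  # ignore anything this simple parser does not understand
        labels = tuple(sorted(LABEL.findall(match.group("labels") or "")))
        metrics[match.group("name")][labels] = float(match.group("value"))
    return metrics


def device_usage_percent(metrics):
    """Map each device's label set to used/size expressed as a percentage."""
    sizes = metrics.get("heketi_device_size", {})
    used = metrics.get("heketi_device_used", {})
    return {
        labels: 100.0 * used[labels] / size
        for labels, size in sizes.items()
        if labels in used and size > 0
    }


if __name__ == "__main__":
    for labels, pct in device_usage_percent(parse_metrics(SAMPLE)).items():
        print(dict(labels), f"{pct:.1f}% used")
```

In practice the same ratio would more likely be computed with a Prometheus query; the sketch only shows what the raw metrics contain.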
Populating Heketi metrics in Prometheus requires additional configuration of the Heketi service. You must add the prometheus.io/scheme and prometheus.io/scrape annotations using the following commands:
    # oc annotate svc heketi-storage prometheus.io/scheme=http
    # oc annotate svc heketi-storage prometheus.io/scrape=true
    # oc describe svc heketi-storage
    Name:         heketi-storage
    Namespace:    app-storage
    Labels:       glusterfs=heketi-storage-service
                  heketi=storage-service
    Annotations:  description=Exposes Heketi service
                  prometheus.io/scheme=http
                  prometheus.io/scrape=true
    Selector:     glusterfs=heketi-storage-pod
    Type:         ClusterIP
    IP:           172.30.90.87
    Port:         heketi 8080/TCP
    TargetPort:   8080/TCP
Populating Heketi metrics in Prometheus also requires additional configuration of the Prometheus configmap. As shown in the following, you must add the namespace of the Heketi service to the Prometheus configmap and restart the prometheus-0 pod:
    # oc get svc --all-namespaces | grep heketi
    app-storage   heketi-storage   ClusterIP   172.30.90.87   <none>   8080/TCP

    # oc get cm prometheus -o yaml -n openshift-metrics
    ....
    - job_name: 'kubernetes-service-endpoints'
      ...
      relabel_configs:
      # only scrape infrastructure components
      - source_labels: [__meta_kubernetes_namespace]
        action: keep
        regex: 'default|logging|metrics|kube-.+|openshift|openshift-.+|app-storage'

    # oc scale --replicas=0 statefulset.apps/prometheus
    # oc scale --replicas=1 statefulset.apps/prometheus
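Because Prometheus anchors relabel regexes at both ends, the namespace of the Heketi service has to match the keep expression in full. The following Python sketch mimics that behavior with re.fullmatch so you can check a namespace name before editing the configmap. The function name is hypothetical, and Python's regex engine is only a stand-in for the one Prometheus uses, which is adequate for a simple alternation like this one.

```python
import re

# Regex copied from the relabel_configs example above.
KEEP_REGEX = r"default|logging|metrics|kube-.+|openshift|openshift-.+|app-storage"


def is_scraped(namespace, regex=KEEP_REGEX):
    """Return True if a 'keep' rule with this regex would retain the namespace.

    re.fullmatch approximates the way Prometheus anchors relabel regexes
    at both the start and the end of the label value.
    """
    return re.fullmatch(regex, namespace) is not None


if __name__ == "__main__":
    for ns in ("app-storage", "appstorage", "openshift-metrics", "my-app-storage"):
        print(f"{ns}: {'kept' if is_scraped(ns) else 'dropped'}")
```

A namespace that differs by even one character, or that merely contains app-storage as a substring, is dropped, so the value added to the regex must be the exact namespace reported by oc get svc.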
Online expansion of GlusterFS volumes and custom naming
First, let's discuss what's needed to allow expansion of GlusterFS volumes. This opt-in feature requires two things: configuring the StorageClass for OCS with allowVolumeExpansion set to "true" and enabling the feature gate ExpandPersistentVolumes. You can then dynamically resize storage volumes attached to containerized applications without needing to first detach and then attach a storage volume with increased capacity, which enhances application availability and uptime.
Enable the ExpandPersistentVolumes feature gate on all master nodes:
    # vim /etc/origin/master/master-config.yaml
    kubernetesMasterConfig:
      apiServerArguments:
        feature-gates:
        - ExpandPersistentVolumes=true

    # /usr/local/bin/master-restart api
    # /usr/local/bin/master-restart controllers
This release also supports custom volume names. When you parameterize the StorageClass with a volume name prefix (`volumenameprefix: myPrefix`), new OCS PVs are created with a name composed of the volume name prefix, project name/namespace, claim name, and UUID (<myPrefix>_<namespace>_<claimname>_UUID).
This makes volumes easier to identify in the GlusterFS backend and makes it easier for you to automate day-2 admin tasks such as backup and recovery or applying policies based on a predefined volume naming scheme.
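To show how the naming scheme can help automation, here is a small Python sketch that composes a name in the <myPrefix>_<namespace>_<claimname>_UUID form and splits one back into its parts. Both function names are hypothetical, and the sketch assumes the UUID portion contains no underscores; the exact UUID format used for real volumes is not specified here. Kubernetes namespace and claim names cannot contain underscores, which is what makes splitting from the right reliable.

```python
import uuid


def build_volume_name(prefix, namespace, claim, vol_uuid=None):
    """Compose <prefix>_<namespace>_<claimname>_<UUID> (illustrative only)."""
    # Assumption: any string without underscores can stand in for the UUID.
    vol_uuid = vol_uuid or str(uuid.uuid4())
    return "_".join([prefix, namespace, claim, vol_uuid])


def parse_volume_name(name):
    """Split a volume name into its four components.

    Splitting from the right means a prefix that itself contains
    underscores is still recovered intact.
    """
    parts = name.rsplit("_", 3)
    if len(parts) != 4:
        raise ValueError(f"unexpected volume name format: {name!r}")
    prefix, namespace, claim, vol_uuid = parts
    return {"prefix": prefix, "namespace": namespace,
            "claim": claim, "uuid": vol_uuid}


if __name__ == "__main__":
    name = build_volume_name("gf", "app-storage", "mysql-data")
    print(name)
    print(parse_volume_name(name))
```

A backup script could, for example, parse the volume names reported by the storage backend and group them by namespace.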
The following StorageClass adds support for both online expansion of OCS/GlusterFS PVs and custom volume naming.
    # oc get sc glusterfs-storage -o yaml
    apiVersion: storage.k8s.io/v1
    kind: StorageClass
    metadata:
      name: glusterfs-storage
    parameters:
      resturl: http://heketi-storage-storage.apps.ose-master.example.com
      restuser: admin
      secretName: heketi-storage-admin-secret
      secretNamespace: storage
      volumenameprefix: gf ❶
    allowVolumeExpansion: true ❷
    provisioner: kubernetes.io/glusterfs
    reclaimPolicy: Delete
❶ Custom volume name support: <volumenameprefixstring>_<namespace>_<claimname>_UUID
❷ Parameter needed for online expansion or resize of GlusterFS PVs
Be aware that PV expansion is not supported for block volumes, only for file volumes.
Expanding a volume starts with editing the PVC field "requests:storage" with the new size for the PersistentVolume. For example, to expand a 1GiB PV to 2GiB, edit the PVC field "requests:storage" and set it to the new value. The PV is then automatically resized to 2GiB, and the new size is reflected in OCP, heketi-cli, and gluster commands. The expansion process creates another replica set and converts the 3-way replicated volume to a distributed-replicated volume, with 2x3 instead of 1x3 bricks.
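If you prefer to script the change rather than edit the claim interactively, the edit amounts to setting spec.resources.requests.storage on the PVC. The Python sketch below builds such a patch body and refuses requests that would not grow the volume, since claims can only be expanded. The helper names are hypothetical, and the quantity parser is a deliberate simplification that handles only whole numbers with binary suffixes.

```python
import json

# Binary suffixes handled by this simplified parser.
_UNITS = {"Ki": 2**10, "Mi": 2**20, "Gi": 2**30, "Ti": 2**40}


def to_bytes(quantity):
    """Convert a quantity such as '2Gi' to bytes (whole numbers only)."""
    for suffix, factor in _UNITS.items():
        if quantity.endswith(suffix):
            return int(quantity[: -len(suffix)]) * factor
    return int(quantity)  # no suffix: already a byte count


def expansion_patch(current, requested):
    """Return a JSON patch body that sets the PVC storage request.

    Raises ValueError if the requested size is not larger than the current one.
    """
    if to_bytes(requested) <= to_bytes(current):
        raise ValueError(f"{requested} is not larger than {current}")
    return json.dumps(
        {"spec": {"resources": {"requests": {"storage": requested}}}}
    )


if __name__ == "__main__":
    # Mirrors the 1GiB to 2GiB example in the text.
    print(expansion_patch("1Gi", "2Gi"))
```

The printed JSON could then be supplied to oc patch pvc <claim_name> with the -p option instead of using oc edit.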
GlusterFS arbiter volumes
Arbiter volume support is new to OCS 3.10 and has the following advantages:
- An arbiter volume is still a 3-way replicated volume for highly available storage.
- Arbiter bricks do not store file data; they only store file names, structure, and metadata.
- Arbiter uses client quorum to compare this metadata with the metadata on the other nodes to ensure consistency of the volume and prevent split-brain conditions.
- Using Heketi commands, it is possible to control arbiter brick placement using tagging so that all arbiter bricks are on the same node.
- With control of arbiter brick placement, the ‘arbiter’ node can have limited storage compared to other nodes in the cluster.
The following example has two gluster volumes configured across 5 nodes to create two 3-way arbitrated replicated volumes, with the arbiter bricks on a dedicated arbiter node.
In order to use arbiter volumes with OCP workloads, an additional parameter must be added to the GlusterFS StorageClass: user.heketi.arbiter true. The following StorageClass adds support for online expansion of GlusterFS PVs, custom volume naming, and arbiter volumes.
    # oc get sc glusterfs-storage -o yaml
    apiVersion: storage.k8s.io/v1
    kind: StorageClass
    metadata:
      name: glusterfs-storage
    parameters:
      resturl: http://heketi-storage-storage.apps.ose-master.example.com
      restuser: admin
      secretName: heketi-storage-admin-secret
      secretNamespace: storage
      volumenameprefix: gf ❶
      volumeoptions: user.heketi.arbiter true ❸
    allowVolumeExpansion: true ❷
    provisioner: kubernetes.io/glusterfs
    reclaimPolicy: Delete
❶ Custom volume name support: <volumenameprefixstring>_<namespace>_<claimname>_UUID
❷ Parameter needed for online expansion or resize of GlusterFS volumes
❸ Enable arbiter volume support in the StorageClass. All PVs created from this StorageClass will be 3-way arbitrated replicated volumes.
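The roughly 30% savings mentioned in the introduction can be reasoned about with simple arithmetic: a replica-3 volume stores three full copies of the data, while an arbiter volume stores two full copies plus a metadata-only arbiter brick. The Python sketch below works through that comparison. The arbiter overhead of 10% of one data copy is a hypothetical value chosen purely for illustration; the real size of an arbiter brick depends on the number and size of the files stored.

```python
def raw_capacity(data_gib, full_copies, arbiter_overhead=0.0):
    """Raw storage needed for data_gib of user data.

    arbiter_overhead is the arbiter brick size expressed as a fraction of
    one full data copy (an assumption; it varies with the workload).
    """
    return data_gib * (full_copies + arbiter_overhead)


def arbiter_savings(arbiter_overhead):
    """Fraction of raw capacity saved versus a replica-3 volume."""
    replica3 = raw_capacity(1.0, 3)
    arbiter = raw_capacity(1.0, 2, arbiter_overhead)
    return 1.0 - arbiter / replica3


if __name__ == "__main__":
    # Upper bound: an arbiter brick that takes no space at all.
    print(f"no overhead:  {arbiter_savings(0.0):.1%}")
    # Hypothetical 10% metadata overhead.
    print(f"10% overhead: {arbiter_savings(0.1):.1%}")
```

With no overhead the saving is one third of the raw capacity, and any space the arbiter brick does consume moves the figure below that bound.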
Want to learn more?
For hands-on experience combining OpenShift and OCS, check out our test drive, a free, in-browser lab experience that walks you through using both. Also, check out this short video explaining why using OCS with OpenShift is the right choice for the container storage infrastructure. For details on running OCS 3.10 with OCP 3.10, click here.