Red Hat OpenShift, a widely used Kubernetes-based container platform, has always been about flexibility, scalability, and resilience. As workloads evolve, so do their requirements for resources such as CPU and memory. Traditionally, adjusting these resources for a running Pod meant recreating the Pod. With in-place resource resizing, this is changing. Let's dive into what in-place resource resizing is and why it's a game-changer for OpenShift users.

This feature is alpha in Kubernetes 1.27 and behind a feature gate in OpenShift 4.14.

What is In-place Resource Resize?

In-place resource resize refers to the ability to adjust the CPU and memory requests and limits of a running Pod without the need to recreate it. This feature allows for more dynamic resource management, ensuring that applications can be allocated more or fewer resources based on their current needs without causing disruptions.

Why is it Important?

Reduced Downtime: Recreating a Pod to adjust its resources can lead to downtime, especially if the Pod is part of a StatefulSet or if it's handling critical tasks. In-place resizing reduces this downtime, ensuring smoother operations.

Efficient Resource Utilization: Over-provisioning resources can lead to wastage, while under-provisioning can cause performance issues. Dynamic resizing ensures that resources are used efficiently, based on real-time needs.

Cost Savings: Efficient resource utilization can lead to cost savings, especially in cloud environments where you pay for the resources you use.

Simplified Operations: There is no need to manually intervene and recreate Pods or adjust deployment configurations, which reduces operational overhead.

How Does it Work?

Warning: Applying a CustomNoUpgrade FeatureSet as instructed below will render your cluster permanently unable to be upgraded. Do not use this procedure on anything important, or anything you ever intend to upgrade.

Enable The FeatureGate: Apply a CustomNoUpgrade FeatureSet containing the InPlacePodVerticalScaling FeatureGate:

apiVersion: config.openshift.io/v1
kind: FeatureGate
metadata:
  name: cluster
spec:
  customNoUpgrade:
    enabled:
    - InPlacePodVerticalScaling
    - AlibabaPlatform
    - BuildCSIVolumes
    - CloudDualStackNodeIPs
    - ExternalCloudProviderAzure
    - ExternalCloudProviderExternal
    - OpenShiftPodSecurityAdmission
    - PrivateHostedZoneAWS
  featureSet: CustomNoUpgrade

(The CustomNoUpgrade FeatureSet supersedes the existing cluster default FeatureSet rather than merging with it, so I have also included the FeatureGates enabled by the default FeatureSet.)
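
Because the custom set replaces the default one, it can help to generate the manifest from a list instead of editing it by hand. The following is a minimal Python sketch of that idea, not an official tool. The helper name build_feature_gate and the DEFAULT_ENABLED list are illustrative assumptions; the gate names are simply copied from the manifest above and may differ on your cluster version. The output is JSON, which oc apply -f accepts in the same way as YAML.

```python
import json

# Gates copied from the manifest above. Assumption: these match the default
# FeatureSet of your cluster version; verify before applying anything.
DEFAULT_ENABLED = [
    "AlibabaPlatform",
    "BuildCSIVolumes",
    "CloudDualStackNodeIPs",
    "ExternalCloudProviderAzure",
    "ExternalCloudProviderExternal",
    "OpenShiftPodSecurityAdmission",
    "PrivateHostedZoneAWS",
]


def build_feature_gate(extra_gates, default_enabled=DEFAULT_ENABLED):
    """Return a CustomNoUpgrade FeatureGate manifest as a plain dict.

    The extra gates come first, followed by the default gates, with
    duplicates removed while preserving order.
    """
    enabled = []
    for gate in list(extra_gates) + list(default_enabled):
        if gate not in enabled:
            enabled.append(gate)
    return {
        "apiVersion": "config.openshift.io/v1",
        "kind": "FeatureGate",
        "metadata": {"name": "cluster"},
        "spec": {
            "featureSet": "CustomNoUpgrade",
            "customNoUpgrade": {"enabled": enabled},
        },
    }


manifest = build_feature_gate(["InPlacePodVerticalScaling"])
print(json.dumps(manifest, indent=2))
```

Keeping the default gates in one list makes it harder to drop one by accident when adding or removing a custom gate.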

Wait For The FeatureGate To Be Applied: It will take around 20 minutes for the kube-apiserver-operator to apply the FeatureGate change to all kube-apiserver instances and for the machine-config-operator to roll out the config change to each node’s kubelet and restart it.

Create A Pod: For our purposes you need to create a pod whose container requests and limits differ so it doesn’t get assigned the “Guaranteed” QoS class. A resize is not allowed if it would violate other pod mutability constraints, and the pod’s QoS class is still immutable, so a resize that would move the pod to a different QoS class is rejected.

apiVersion: v1
kind: Pod
metadata:
  name: resizeme
spec:
  containers:
  - name: resizeme
    image: ubi9/ubi
    command: ["tail", "-f", "/dev/null"]
    resources:
      requests:
        cpu: 1
        memory: "512Mi"
      limits:
        cpu: 2
        memory: "1Gi"
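
To see why this pod is not Guaranteed, here is a rough Python sketch of the QoS classification rule. It is a simplification under stated assumptions, not the kubelet's implementation: the function name qos_class is made up for illustration, quantities are compared as plain strings (so "1" and "1000m" would wrongly count as different), and init containers are ignored.

```python
def qos_class(containers):
    """Classify a pod as Guaranteed, Burstable, or BestEffort.

    `containers` is a list of dicts shaped like the `resources` stanza of a
    container spec: {"requests": {...}, "limits": {...}}.
    """
    any_set = False
    guaranteed = True
    for res in containers:
        requests = res.get("requests", {})
        limits = res.get("limits", {})
        if requests or limits:
            any_set = True
        for name in ("cpu", "memory"):
            limit = limits.get(name)
            # A missing request defaults to the limit.
            request = requests.get(name, limit)
            # Simplification: string comparison instead of quantity parsing.
            if limit is None or str(request) != str(limit):
                guaranteed = False
    if not any_set:
        return "BestEffort"
    return "Guaranteed" if guaranteed else "Burstable"


# The resizeme pod from the manifest above.
resizeme = [{
    "requests": {"cpu": 1, "memory": "512Mi"},
    "limits": {"cpu": 2, "memory": "1Gi"},
}]
print(qos_class(resizeme))
```

Raising the CPU request to 2 later keeps the class unchanged in this sketch, because the memory request and limit still differ.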

Observe The New Pod/Container Resize Fields: There should now be resizePolicy fields populated in the container spec:

$ oc get pod resizeme -o yaml
...
  containers:
  - command:
    - tail
    - -f
    - /dev/null
    image: ubi9/ubi
    imagePullPolicy: Always
    name: resizeme
    resizePolicy:
    - resourceName: cpu
      restartPolicy: NotRequired
    - resourceName: memory
      restartPolicy: NotRequired

And allocatedResources fields populated in the container status:

$ oc get pod resizeme -o yaml
...
  containerStatuses:
  - allocatedResources:
      cpu: "1"
      memory: 512Mi

These should indicate that the in-place resize feature is now available.

Resize The Container’s Resources: Change the pod’s CPU request from 1 to 2. You can also use oc edit to make a change.

$ oc patch pod resizeme -p '{"spec": {"containers": [{"name": "resizeme", "resources": {"requests": {"cpu": 2, "memory": "512Mi"}, "limits": {"cpu": 2, "memory": "1Gi"}}}]}}'
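
Hand-quoting nested JSON on the command line is easy to get wrong. As a hedged alternative, the same patch body can be built in Python and serialized with json.dumps. The helper name resize_patch is an assumption for illustration; the oc patch arguments are the same ones used in the command above.

```python
import json
import shlex


def resize_patch(container, requests, limits):
    """Build a patch body that sets one container's requests and limits."""
    return {
        "spec": {
            "containers": [
                {
                    "name": container,
                    "resources": {"requests": requests, "limits": limits},
                }
            ]
        }
    }


patch = resize_patch(
    "resizeme",
    requests={"cpu": "2", "memory": "512Mi"},
    limits={"cpu": "2", "memory": "1Gi"},
)

# Assemble the same command as above; shlex.join handles the shell quoting.
cmd = ["oc", "patch", "pod", "resizeme", "-p", json.dumps(patch)]
print(shlex.join(cmd))
```

The printed line can be pasted into a shell, or cmd can be passed to subprocess.run as a list to avoid shell quoting altogether.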

Watch The Pod React: The resize doesn’t happen instantly. You will see a resize field appear in the pod status, and will see the pod go through a Proposed phase and an InProgress phase:

$ oc get pods resizeme -o jsonpath="{.status.resize}{'\n'}"
InProgress
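
Rather than re-running that command by hand, the check can be polled. The sketch below is one possible approach in Python and rests on a few assumptions: that an empty resize field means no resize is pending, that Infeasible is a terminal state worth stopping on, and that the helper names oc_resize_status and wait_for_resize (both made up here) suit your setup. The status reader is passed in as a callable so the loop can be exercised without a cluster.

```python
import subprocess
import time


def oc_resize_status(pod):
    """Read .status.resize using the same jsonpath query as above."""
    result = subprocess.run(
        ["oc", "get", "pods", pod, "-o", "jsonpath={.status.resize}"],
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout.strip()


def wait_for_resize(get_status, timeout=120.0, interval=2.0, sleep=time.sleep):
    """Poll until the resize field is empty; return the distinct states seen."""
    seen = []
    deadline = time.monotonic() + timeout
    while True:
        status = get_status()
        if status == "":
            # Assumption: an empty field means nothing is pending any more.
            return seen
        if not seen or seen[-1] != status:
            seen.append(status)
        if status == "Infeasible":
            raise RuntimeError("resize reported as Infeasible")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"resize still {status!r} after {timeout}s")
        sleep(interval)
```

Against a real cluster this would be called as wait_for_resize(lambda: oc_resize_status("resizeme")).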

Observe The Successful Resize: Eventually, once the resize is complete, your resource changes will be reflected in the container status:

$ oc get pods resizeme -o yaml
...
  containerStatuses:
  - allocatedResources:
      cpu: "2"
      memory: 512Mi
    containerID: cri-o://886b87d7b75a4eb5cddb265cb9991238ed002d7757208ac80aab05604057b24f
    image: registry.access.redhat.com/ubi9/ubi:latest
    imageID: registry.access.redhat.com/ubi9/ubi@sha256:351ed8b24d440c348486efd99587046e88bb966890a9207a5851d3a34a4dd346
    lastState: {}
    name: resizeme
    ready: true
    resources:
      limits:
        cpu: "2"
        memory: 1Gi
      requests:
        cpu: "2"
        memory: 512Mi
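
The same check can be scripted against the JSON form of the pod (oc get pods resizeme -o json). This is a small illustrative Python sketch, assuming the field layout shown in the output above and that the API server reports spec and status quantities in the same string form; the function name resize_applied is hypothetical.

```python
def resize_applied(pod, container):
    """Return True if the container status matches the resources in the spec."""
    spec = next(c for c in pod["spec"]["containers"] if c["name"] == container)
    status = next(
        s for s in pod["status"]["containerStatuses"] if s["name"] == container
    )
    wanted = spec.get("resources", {})
    # allocatedResources should equal the requests, and the status
    # `resources` block should equal the whole requested stanza.
    return (
        status.get("allocatedResources") == wanted.get("requests")
        and status.get("resources") == wanted
    )


# A trimmed-down pod object mirroring the fields shown above.
example_pod = {
    "spec": {"containers": [{
        "name": "resizeme",
        "resources": {
            "requests": {"cpu": "2", "memory": "512Mi"},
            "limits": {"cpu": "2", "memory": "1Gi"},
        },
    }]},
    "status": {"containerStatuses": [{
        "name": "resizeme",
        "allocatedResources": {"cpu": "2", "memory": "512Mi"},
        "resources": {
            "limits": {"cpu": "2", "memory": "1Gi"},
            "requests": {"cpu": "2", "memory": "512Mi"},
        },
    }]},
}
print(resize_applied(example_pod, "resizeme"))
```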

More details on configuration options and constraints can be found in the upstream Kubernetes documentation on resizing CPU and memory resources assigned to containers.

Limitations and Considerations

While in-place resource resizing offers numerous benefits, there are some considerations:

Not All Resources Can Be Adjusted: While CPU and memory can be adjusted, other resources like storage are not currently supported for in-place resizing.

Potential for Resource Contention: If resources are reduced too aggressively, it might lead to resource contention among Pods.

Compatibility with Container Runtimes: Ensure that your container runtime supports dynamic resource adjustments.

Conclusion

In-place resource resizing for OpenShift Pods is a step towards more dynamic and efficient resource management. As OpenShift continues to evolve, features like this highlight its adaptability and responsiveness to the needs of modern applications and infrastructures. As always, when using such features, it's essential to monitor and manage resources wisely to ensure optimal performance and cost-efficiency.

One rough edge: there is currently nowhere to ask "has the cluster completely finished processing this FeatureGate change?", so the wait after enabling the FeatureGate is a matter of watching the operators and nodes settle.