Red Hat OpenShift, a widely used container orchestration platform, has always been about flexibility, scalability, and resilience. As workloads evolve, so do their requirements for resources such as CPU and memory. Traditionally, adjusting these resources for a running Pod meant recreating the Pod. With in-place resource resizing, this is changing. Let's dive into what in-place resource resizing is and why it's a game-changer for OpenShift users.
This feature is alpha in Kubernetes 1.27 and behind a feature gate in OpenShift 4.14.
What is In-place Resource Resize?
In-place resource resize refers to the ability to adjust the CPU and memory requests and limits of a running Pod without the need to recreate it. This feature allows for more dynamic resource management, ensuring that applications can be allocated more or fewer resources based on their current needs without causing disruptions.
Why is it Important?
Reduced Downtime: Recreating a Pod to adjust its resources can lead to downtime, especially if the Pod is part of a StatefulSet or if it's handling critical tasks. In-place resizing reduces this downtime, ensuring smoother operations.
Efficient Resource Utilization: Over-provisioning resources can lead to wastage, while under-provisioning can cause performance issues. Dynamic resizing ensures that resources are used efficiently, based on real-time needs.
Cost Savings: Efficient resource utilization can lead to cost savings, especially in cloud environments where you pay for the resources you use.
Simplified Operations: There is no need to manually intervene and recreate Pods or adjust deployment configurations, which reduces operational overhead.
How Does it Work?
Warning: Applying a CustomNoUpgrade FeatureSet as instructed below will render your cluster permanently unable to be upgraded. Do not use this procedure on anything important, or anything you ever intend to upgrade.
Enable The FeatureGate: Apply a CustomNoUpgrade FeatureSet containing the InPlacePodVerticalScaling FeatureGate:
apiVersion: config.openshift.io/v1
kind: FeatureGate
metadata:
  name: cluster
spec:
  customNoUpgrade:
    enabled:
    - InPlacePodVerticalScaling
    - AlibabaPlatform
    - BuildCSIVolumes
    - CloudDualStackNodeIPs
    - ExternalCloudProviderAzure
    - ExternalCloudProviderExternal
    - OpenShiftPodSecurityAdmission
    - PrivateHostedZoneAWS
  featureSet: CustomNoUpgrade
(The CustomNoUpgrade FeatureSet supersedes the existing cluster default FeatureSet rather than merging with it, so I have also included the FeatureGates enabled by the default FeatureSet.)
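Once the manifest has been applied (for example with oc apply -f), a quick sanity check is to confirm that the gate is listed in the FeatureGate spec. The following is a minimal sketch: the helper name featuregate_listed is hypothetical, and it only inspects the spec, so it says nothing about whether the change has rolled out yet.

```shell
# Hypothetical helper: succeeds if the named gate appears in the
# customNoUpgrade.enabled list of the cluster FeatureGate resource.
featuregate_listed() {
    gate="$1"
    enabled=$(oc get featuregate cluster \
        -o jsonpath='{.spec.customNoUpgrade.enabled[*]}') || return 1
    # jsonpath prints the list entries separated by spaces.
    for g in $enabled; do
        [ "$g" = "$gate" ] && return 0
    done
    return 1
}

# Usage (needs a logged-in oc session):
#   featuregate_listed InPlacePodVerticalScaling && echo "gate is listed"
```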
Wait For The FeatureGate To Be Applied: It will take around 20 minutes for the kube-apiserver-operator to apply the FeatureGate change to all kube-apiserver instances and for the machine-config-operator to roll out the config change to each node’s kubelet and restart it.
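Rather than watching the clock, one rough approximation is to wait for the kube-apiserver cluster operator to stop progressing and for every machine config pool to report Updated. This is a sketch under assumptions: the function name and the 30-minute default timeout are illustrative, and if it is run immediately after applying the change the operators may not have started progressing yet, so it can return too early.

```shell
# Hypothetical helper: block until the components named in the text appear
# to have settled. This approximates "rollout done"; it is not an
# authoritative check.
wait_for_featuregate_rollout() {
    timeout="${1:-30m}"
    # kube-apiserver-operator has finished rolling out apiserver revisions.
    oc wait clusteroperator/kube-apiserver \
        --for=condition=Progressing=False --timeout="$timeout" || return 1
    # machine-config-operator has finished updating every pool's nodes.
    oc wait machineconfigpool --all \
        --for=condition=Updated=True --timeout="$timeout" || return 1
}

# Usage (needs a logged-in oc session):
#   wait_for_featuregate_rollout 45m
```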
Create A Pod: For our purposes, you need to create a pod whose container requests and limits differ so that it doesn’t get assigned the “Guaranteed” QoS class. A resize is not allowed if it would violate other pod mutability constraints, and the pod’s QoS class is still immutable.
apiVersion: v1
kind: Pod
metadata:
  name: resizeme
spec:
  containers:
  - name: resizeme
    image: ubi9/ubi
    command: ["tail", "-f", "/dev/null"]
    resources:
      requests:
        cpu: 1
        memory: "512Mi"
      limits:
        cpu: 2
        memory: "1Gi"
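After the pod has been created, the QoS class it was assigned can be read directly from the pod status; with requests and limits that differ, as in the resizeme pod, it should be Burstable rather than Guaranteed. A small sketch, with hypothetical helper names:

```shell
# Hypothetical helper: print the QoS class recorded in the pod status.
pod_qos_class() {
    oc get pod "$1" -o jsonpath='{.status.qosClass}'
}

# Hypothetical guard: fail if the pod ended up in the Guaranteed class,
# which is what the differing requests and limits are meant to avoid.
require_not_guaranteed() {
    qos=$(pod_qos_class "$1") || return 1
    if [ "$qos" = "Guaranteed" ]; then
        echo "pod $1 is Guaranteed; use different requests and limits" >&2
        return 1
    fi
    echo "pod $1 QoS class: $qos"
}

# Usage (needs a logged-in oc session):
#   require_not_guaranteed resizeme
```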
Observe The New Pod/Container Resize Fields: There should now be resizePolicy fields populated in the container spec:
$ oc get pod resizeme -o yaml
...
  containers:
  - command:
    - tail
    - -f
    - /dev/null
    image: ubi9/ubi
    imagePullPolicy: Always
    name: resizeme
    resizePolicy:
    - resourceName: cpu
      restartPolicy: NotRequired
    - resourceName: memory
      restartPolicy: NotRequired
There should also be allocatedResources fields populated in the container status:
$ oc get pod resizeme -o yaml
...
  containerStatuses:
  - allocatedResources:
      cpu: "1"
      memory: 512Mi
These should indicate that the in-place resize feature is now available.
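To avoid scanning the full YAML, the two fields can be pulled out with jsonpath. The sketch below assumes a single-container pod (it reads index 0), and the helper name is hypothetical.

```shell
# Hypothetical helper: print the resize-related fields of a pod's first
# container, one per line.
show_resize_fields() {
    pod="$1"
    printf 'resizePolicy: '
    oc get pod "$pod" -o jsonpath='{.spec.containers[0].resizePolicy}'
    printf '\nallocatedResources: '
    oc get pod "$pod" \
        -o jsonpath='{.status.containerStatuses[0].allocatedResources}'
    printf '\n'
}

# Usage (needs a logged-in oc session):
#   show_resize_fields resizeme
```

If either line comes back empty, the feature gate has most likely not reached the API server or the node yet.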
Resize The Container’s Resources: Change the pod’s CPU request from 1 to 2. You can also use oc edit to make a change.
$ oc patch pod resizeme -p '{"spec": {"containers": [{"name": "resizeme", "resources": {"requests": {"cpu": 2, "memory": "512Mi"}, "limits": {"cpu": 2, "memory": "1Gi"}}}]}}'
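Hand-writing the JSON patch is error-prone, so it can help to generate it. The following sketch prints a patch with the same shape as the one above; the function name resize_patch and its argument order are illustrative choices, and the CPU values are emitted as strings, which Kubernetes also accepts as quantities.

```shell
# Hypothetical helper: print a patch that sets requests and limits for one
# container. Arguments: container name, cpu request, memory request,
# cpu limit, memory limit.
resize_patch() {
    printf '{"spec":{"containers":[{"name":"%s","resources":{"requests":{"cpu":"%s","memory":"%s"},"limits":{"cpu":"%s","memory":"%s"}}}]}}\n' \
        "$1" "$2" "$3" "$4" "$5"
}

# Usage (needs a logged-in oc session):
#   oc patch pod resizeme -p "$(resize_patch resizeme 2 512Mi 2 1Gi)"
```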
Watch The Pod React: The resize doesn’t happen instantly. You will see a resize field appear in the pod status, and will see the pod go through a Proposed phase and an InProgress phase:
$ oc get pods resizeme -o jsonpath="{.status.resize}{'\n'}"
InProgress
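Instead of re-running the command by hand, the status can be polled in a loop. This sketch assumes that an empty resize field means no resize is pending, and it also handles the Deferred and Infeasible values defined by the upstream API in addition to the two phases mentioned above. The function name and the default two-second interval are hypothetical.

```shell
# Hypothetical helper: poll .status.resize until the field is cleared.
# Arguments: pod name, optional poll interval in seconds (default 2).
wait_for_resize() {
    pod="$1"
    interval="${2:-2}"
    while :; do
        state=$(oc get pod "$pod" -o jsonpath='{.status.resize}') || return 1
        case "$state" in
            "")
                echo "resize complete"
                return 0 ;;
            Infeasible)
                echo "resize cannot be satisfied on this node" >&2
                return 1 ;;
            *)
                # Proposed, InProgress or Deferred: keep waiting.
                echo "resize state: $state" ;;
        esac
        sleep "$interval"
    done
}

# Usage (needs a logged-in oc session):
#   wait_for_resize resizeme
```

A Deferred resize may wait indefinitely, so a real script would probably want an overall timeout as well.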
Observe The Successful Resize: Eventually, once the resize is complete, your resource changes will be reflected in the container status:
$ oc get pods resizeme -o yaml
...
  containerStatuses:
  - allocatedResources:
      cpu: "2"
      memory: 512Mi
    containerID: cri-o://886b87d7b75a4eb5cddb265cb9991238ed002d7757208ac80aab05604057b24f
    image: registry.access.redhat.com/ubi9/ubi:latest
    imageID: registry.access.redhat.com/ubi9/ubi@sha256:351ed8b24d440c348486efd99587046e88bb966890a9207a5851d3a34a4dd346
    lastState: {}
    name: resizeme
    ready: true
    resources:
      limits:
        cpu: "2"
        memory: 1Gi
      requests:
        cpu: "2"
        memory: 512Mi
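The same check can be scripted by comparing the CPU request in the pod spec with the allocated value in the container status. A minimal sketch, assuming a single-container pod and a plain string comparison of the two quantities; the helper name is hypothetical.

```shell
# Hypothetical helper: succeed when the CPU request in the pod spec matches
# the allocated CPU reported for the first container.
cpu_resize_applied() {
    pod="$1"
    wanted=$(oc get pod "$pod" \
        -o jsonpath='{.spec.containers[0].resources.requests.cpu}') || return 1
    allocated=$(oc get pod "$pod" \
        -o jsonpath='{.status.containerStatuses[0].allocatedResources.cpu}') || return 1
    echo "requested cpu: $wanted, allocated cpu: $allocated"
    [ "$wanted" = "$allocated" ]
}

# Usage (needs a logged-in oc session):
#   cpu_resize_applied resizeme && echo "resize applied"
```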
More details on configuration options and constraints can be found in the upstream Kubernetes documentation.
Limitations and Considerations
While in-place resource resizing offers numerous benefits, there are some considerations:
Not All Resources Can Be Adjusted: While CPU and memory can be adjusted, other resources like storage are not currently supported for in-place resizing.
Potential for Resource Contention: If resources are reduced too aggressively, it might lead to resource contention among Pods.
Compatibility with Container Runtimes: Ensure that your container runtime supports dynamic resource adjustments.
Conclusion
In-place resource resizing for OpenShift Pods is a step towards more dynamic and efficient resource management. As OpenShift continues to evolve, features like this highlight its adaptability and responsiveness to the needs of modern applications and infrastructures. As always, while leveraging such features, it's essential to monitor and manage resources wisely to ensure optimal performance and cost-efficiency.
One rough edge remains: there is currently no single place to check whether the cluster has completely finished processing a FeatureGate change.