As an OpenShift Container Platform operator, managing resources on nodes is one of your most important tasks. Setting LimitRange and Quota is the right way to limit resources. Many blog posts cover Quota and LimitRange from the OpenShift Container Platform perspective, but they do not explain the relationship between those Kubernetes objects and control groups (cgroups) in the Linux kernel. Since I haven’t seen this covered elsewhere, I decided to dig into the connection, with particular attention to CPU and memory limits.
Needed: basic knowledge of Red Hat Enterprise Linux 7
Red Hat OpenShift has Red Hat Enterprise Linux at its foundation. To understand Quota and LimitRange in OpenShift, we first need to look at the underlying Red Hat Enterprise Linux features. I will cover the basics of cgroups and systemd in Red Hat Enterprise Linux 7.
Control groups
The cgroups feature has existed in Linux for quite some time, but it has recently become more prominent because of Linux containers and Kubernetes. It allows us to limit the resource usage of processes. In Red Hat Enterprise Linux 7, cgroups are available by default, and systemd mounts the important resource controllers under the /sys/fs/cgroup directory.
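To check which controllers the kernel offers on a given host, one option is to read /proc/cgroups. The following is a minimal sketch; the function name list_enabled_controllers is a hypothetical helper, not part of any tool.

```shell
# Print the names of enabled cgroup controllers.
# Input is a file in /proc/cgroups format with the columns:
#   subsys_name  hierarchy  num_cgroups  enabled
list_enabled_controllers() {
    awk '!/^#/ && $4 == 1 { print $1 }' "${1:-/proc/cgroups}"
}

# On a live system, list the controllers the kernel has enabled.
if [ -r /proc/cgroups ]; then
    list_enabled_controllers
fi
```

On Red Hat Enterprise Linux 7 the names printed here should correspond to the controller directories that systemd mounts under /sys/fs/cgroup.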
Systemd
The systemd system and service manager is responsible for controlling how services are started, stopped and otherwise managed on Red Hat Enterprise Linux 7 systems.
Terms:
- Slice: A slice unit, according to the systemd.slice man page, is a concept for hierarchically managing resources of a group of processes. A slice divides up computer resources (such as CPU and memory) and applies them to selected units.
- Scope: A unit that groups processes created by something other than systemd. Unlike service units, scope units manage externally created processes and do not fork off processes on their own.
- Service: A unit configuration file whose name ends in ".service" encodes information about a process controlled and supervised by systemd.
The relationship between slice, scope, service and processes
Let’s take a quick look at how these terms relate to one another. A slice organizes scopes and services into a hierarchy. Processes are attached to services or scopes, not to slices.
Now that we know the definitions, let’s put them into practice. This example command is from an article by Frederic Giloux: Controlling resources with cgroups for performance testing. Here we create a scope called “fredunit” and then check its status using systemctl.
Scope:
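A minimal sketch of creating such a scope with systemd-run follows. The unit name fredunit comes from the text above; the slice name fredslice, the top -b workload and the helper name run_in_scope are assumptions made for illustration. The command needs root and a running systemd, so it is wrapped in a function and nothing runs until the function is called.

```shell
# Hypothetical helper: start a command inside a transient scope unit.
#   $1 = unit name (systemd-run appends .scope)
#   $2 = slice name (systemd-run appends .slice)
#   remaining arguments = the command to run
run_in_scope() {
    scope_unit="$1"
    scope_slice="$2"
    shift 2
    systemd-run --unit="$scope_unit" --scope --slice="$scope_slice" "$@"
}

# Example (as root):
#   run_in_scope fredunit fredslice top -b
# Then, from another terminal:
#   systemctl status fredunit.scope
```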
Service:
This next example will use a systemd service. If you’d like to learn more about services, Jayaraj Deenadayalan has written a good article to help us understand a Red Hat Enterprise Linux 7 systemd unit file, and how to generate one from traditional sysV init scripts.
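As a rough illustration, the snippet below writes a small unit file into a scratch directory. The unit name fredservice, the slice fredslice.slice and the sleep workload are all hypothetical. To try it on a real host, the file would be copied to /etc/systemd/system as root, followed by systemctl daemon-reload and systemctl start fredservice.

```shell
# Write a hypothetical unit file to a scratch directory (UNIT_DIR overrides it).
unit_dir="${UNIT_DIR:-$(mktemp -d)}"

cat > "$unit_dir/fredservice.service" <<'EOF'
[Unit]
Description=Hypothetical demo service for cgroup experiments

[Service]
ExecStart=/usr/bin/sleep infinity
# Place the service under a custom slice instead of system.slice
Slice=fredslice.slice

[Install]
WantedBy=multi-user.target
EOF

echo "wrote $unit_dir/fredservice.service"
```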
Slice:
The next example shows what the slice looks like. As you can see, the slice organizes the scope and service into a hierarchy.
Resource management in cgroups
- Slices divide the system's resources into a hierarchy. There are four default slices, and more can be added:
  - The “root” slice (-.slice)
  - Users (user.slice)
  - Services (system.slice)
  - Machine (machine.slice)
  - Other slices
- A slice with its own cgroup lets you control the amount of resources available to it.
- Processes under a slice share that slice's resources.
- A slice can set CPU and memory limits.
- A systemd unit is always associated with its own cgroup.
- With systemd's use of cgroups, precise limits can be set on CPU and memory usage, as well as other resources.
Useful systemd commands
To see what services and other units (service, mount, path, socket, and so on) are associated with a particular target, type this command:
systemctl list-dependencies multi-user.target
To see the dependencies of a service, use the list-dependencies command:
systemctl list-dependencies atomic-openshift-node.service
To list specific types of units:
systemctl list-units --type service
systemctl list-units --type mount
To list all units installed on the system, along with their current states:
systemctl list-unit-files
To view resource usage per service (cgroup), run systemd-cgtop. Once it is running, you can press keys to sort by memory (m), CPU (c), task (t), path (p), or I/O load (i):
systemd-cgtop
To output a recursive list of cgroup content:
systemd-cgls
Cgroups limit
Using cgroups, we can divide resources among processes. From a Red Hat Enterprise Linux perspective, this section shows how to set CPU and memory limits and how to monitor the resources that cgroups assign.
I created two scenarios for setting limits in a cgroup. The external URLs give the detailed steps.
- Scenario 1: Use the cpuset hierarchy by creating a folder under /sys/fs/cgroup (scope mode)
- Scenario 2: Use a conf file to set the CPU/memory limit (service mode)
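The two scenarios can be sketched roughly as follows. This is not the exact procedure from the linked steps: it assumes the cgroup v1 layout of Red Hat Enterprise Linux 7, and the helper names, the group name fredgroup, the unit fredservice.service and the limit values are all hypothetical. Both helpers take their target directory from an environment variable, so they can be tried against a scratch directory before touching a real host.

```shell
# Scenario 1 (scope mode): create a cpuset cgroup by hand and move a PID into it.
#   $1 = group name, $2 = CPU list (e.g. 0), $3 = PID to move
# CGROUP_ROOT defaults to the cgroup v1 mount point.
pin_to_cpu() {
    group_dir="${CGROUP_ROOT:-/sys/fs/cgroup}/cpuset/$1"
    mkdir -p "$group_dir" || return 1
    echo "$2" > "$group_dir/cpuset.cpus"   # CPUs the group may use
    echo 0    > "$group_dir/cpuset.mems"   # memory node; 0 on non-NUMA hosts
    echo "$3" > "$group_dir/tasks"         # attach the process to the group
}

# Scenario 2 (service mode): write a drop-in conf file that caps CPU and memory.
#   $1 = unit name, $2 = CPUQuota value, $3 = MemoryLimit value
write_limit_dropin() {
    dropin_dir="${SYSTEMD_DIR:-/etc/systemd/system}/$1.d"
    mkdir -p "$dropin_dir" || return 1
    cat > "$dropin_dir/limits.conf" <<EOF
[Service]
CPUQuota=$2
MemoryLimit=$3
EOF
}

# Example (as root):
#   pin_to_cpu fredgroup 0 "$$"
#   write_limit_dropin fredservice.service 20% 256M
#   systemctl daemon-reload && systemctl restart fredservice
```

cpuset.cpus and cpuset.mems are written before tasks because a cpuset cgroup rejects new tasks until both are set.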
Now that we know how to set the limit, let’s test it. To do this, we will generate memory and CPU load and see whether the limit configuration really keeps the process from exceeding its assigned resources.
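For the CPU half of such a test, a busy loop is enough. The sketch below is one way to generate that load, not necessarily how the original scenario did it; burn_cpu is a hypothetical helper, and memory load is not covered here.

```shell
# Keep one CPU busy for the given number of seconds.
# timeout(1) exits with status 124 when it has to stop the command.
burn_cpu() {
    timeout "$1" sh -c 'while :; do :; done'
}

# Example: start the load from a shell that runs inside the limited scope
# or service, then watch the cgroup's CPU usage with systemd-cgtop.
#   burn_cpu 30 &
```

If the limit is effective, systemd-cgtop should show the cgroup's CPU usage staying near the configured cap rather than at a full core.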
Lastly, I will try to build a slice similar to the ones Kubernetes uses, which I hope gives you insight into how Kubernetes uses cgroups for LimitRange. Kubernetes assigns each pod one of three Quality of Service (QoS) classes (Guaranteed, Burstable, or BestEffort) and creates slices based on that class. I generate the slices by creating folders under /sys/fs/cgroup. The result looks like a chain of slices.
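The chain follows the systemd slice naming rule: the dashes in a slice name encode its position in the tree, so foo-bar.slice lives inside foo.slice. The sketch below expands a slice name into that chain. The pod slice names are an assumption based on the kubelet's systemd cgroup driver (dashes in the pod UID become underscores), and both helper names are hypothetical.

```shell
# Build the slice name for a pod from its QoS class and UID.
# Assumed naming: the kubelet's systemd cgroup driver.
pod_slice_name() {
    pod_uid=$(printf '%s' "$2" | sed 's/-/_/g')
    case "$1" in
        Guaranteed) printf 'kubepods-pod%s.slice\n' "$pod_uid" ;;
        Burstable)  printf 'kubepods-burstable-pod%s.slice\n' "$pod_uid" ;;
        BestEffort) printf 'kubepods-besteffort-pod%s.slice\n' "$pod_uid" ;;
        *) return 1 ;;
    esac
}

# Expand a slice name into its chain of parent slices, for example
#   a-b-c.slice -> a.slice/a-b.slice/a-b-c.slice
slice_to_path() {
    name="${1%.slice}"
    path=""
    prefix=""
    old_ifs="$IFS"
    IFS='-'
    for part in $name; do
        prefix="${prefix:+$prefix-}$part"
        path="${path:+$path/}$prefix.slice"
    done
    IFS="$old_ifs"
    printf '%s\n' "$path"
}

# Example with a made-up pod UID:
#   slice_to_path "$(pod_slice_name Burstable 1a2b-3c4d)"
```

The printed path is relative to a controller directory such as /sys/fs/cgroup/cpu or /sys/fs/cgroup/memory.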
The results of Scenario 4 are briefly summarized as follows:
- The slices that I created
- The slices that Kubernetes created
What do you think? They are very similar to each other, aren’t they?
Conclusion
Red Hat OpenShift Container Platform/Kubernetes uses cgroups because it uses containers, which means limits are set the same way as for any other process. OpenShift Container Platform/Kubernetes uses QoS (Quality of Service) classes, and the chain of slices is created recursively in /sys/fs/cgroup based on the QoS class. The chain of slices allows limits to be set on each container's resources, just as for a normal process. Through this series of demo scenarios, I hope you now have a better understanding of how OpenShift Container Platform sets limits with cgroups.
I would like to thank Frédéric Giloux and Marc Richter. This blog is written on top of their wonderful blogs: “Controlling resources with cgroups for performance testing” and the “World domination with cgroups” series.
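To make the link between container limits and cgroup files concrete, the sketch below shows the arithmetic commonly used with cgroup v1: a CPU limit in millicores becomes cpu.cfs_quota_us for a 100000 microsecond period, a CPU request becomes cpu.shares, and a memory limit is written to memory.limit_in_bytes in bytes. The helper names are hypothetical, and the formulas are an assumption about the usual kubelet behaviour rather than something taken from the scenarios above.

```shell
# Default CFS period, in microseconds (cpu.cfs_period_us)
CFS_PERIOD_US=100000

# CPU limit in millicores -> value for cpu.cfs_quota_us
cpu_limit_to_quota() {
    echo $(( $1 * CFS_PERIOD_US / 1000 ))
}

# CPU request in millicores -> value for cpu.shares
cpu_request_to_shares() {
    echo $(( $1 * 1024 / 1000 ))
}

# Memory limit in MiB -> value for memory.limit_in_bytes
mib_to_bytes() {
    echo $(( $1 * 1024 * 1024 ))
}

# Example: a container with limits.cpu=500m, requests.cpu=250m, limits.memory=512Mi
cpu_limit_to_quota 500      # 50000
cpu_request_to_shares 250   # 256
mib_to_bytes 512            # 536870912
```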
About the author
Jooho Lee is a senior OpenShift Technical Account Manager (TAM) in Toronto supporting middleware products (EAP/DataGrid/Web Server) and cloud products (Docker/Kubernetes/OpenShift/Ansible). He is an active member of JBoss User Group Korea and the OpenShift/Ansible Group.