
As an OpenShift Container Platform operator, managing resources on nodes is one of your most important tasks. Setting LimitRange and Quota is the right way to limit resources. Many blog posts cover Quota and LimitRange from the OpenShift Container Platform perspective, but they do not explain the relationship between those Kubernetes objects and control groups (cgroups) in the Linux kernel. Since I haven’t seen this covered elsewhere, I decided to explore the connection, with particular attention to CPU and memory limits.

Needed: basic knowledge of Red Hat Enterprise Linux 7

Red Hat OpenShift has Red Hat Enterprise Linux at its foundation. To understand Quota and LimitRange in OpenShift, we first need to look at the underlying Red Hat Enterprise Linux features. I will cover the basics of cgroups, systemd and related tooling in Red Hat Enterprise Linux 7.

Control groups

The cgroups feature has existed in Linux for quite some time, but it has recently become more prominent because of Linux containers and Kubernetes. It allows us to limit the resource usage of processes. In Red Hat Enterprise Linux 7, cgroups are enabled by default, and systemd mounts the important resource controllers under the /sys/fs/cgroup directory.
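
To see which cgroup a process belongs to under each controller, the kernel exposes /proc/<pid>/cgroup. The following is a minimal sketch, assuming the cgroup v1 layout used by Red Hat Enterprise Linux 7; the function name show_cgroups is only illustrative and is not part of systemd.

```shell
#!/bin/sh
# Sketch: print "controllers<TAB>cgroup-path" for each entry of a
# /proc/<pid>/cgroup style file (cgroup v1 format: ID:controllers:path).
# show_cgroups is a hypothetical helper name, not a systemd tool.
show_cgroups() {
    file="${1:-/proc/self/cgroup}"
    while IFS=: read -r _id controllers path; do
        # Entries with an empty controller field (cgroup v2) are skipped.
        [ -n "$controllers" ] || continue
        printf '%s\t%s\n' "$controllers" "$path"
    done < "$file"
}
```

Calling show_cgroups with no argument inspects the current shell; passing /proc/<pid>/cgroup inspects another process.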

Systemd

The systemd system and service manager is responsible for controlling how services are started, stopped and otherwise managed on Red Hat Enterprise Linux 7 systems.

Terms:

  • Slice: A slice unit, according to the systemd.slice man page, is a concept for hierarchically managing the resources of a group of processes. A slice divides up computer resources (such as CPU and memory) and applies them to selected units.

  • Scope: A unit that groups processes created by something other than systemd. Unlike service units, scope units manage externally created processes and do not fork off processes on their own.

  • Service: A unit configuration file whose name ends in ".service" encodes information about a process controlled and supervised by systemd.

The relationship between slice, scope, service and processes

Let’s take a quick look at how these terms relate to one another. A slice organizes scopes and services into a hierarchy. Processes are attached to services or scopes, not to slices.

Now that we know the definitions, let’s try them out. This example command is from an article by Frederic Giloux: Controlling resources with cgroups for performance testing. Here we create a scope called “fredunit” and then check its status using systemctl.

Scope:

Figure 1: systemd-run --unit=fredunit --scope --slice=fredslice sh

Service:

This next example will use a systemd service. If you’d like to learn more about services, Jayaraj Deenadayalan has written a good article to help us understand a Red Hat Enterprise Linux 7 systemd unit file, and how to generate one from traditional sysV init scripts.

Figure 2: systemd-run --unit=fredunit --slice=fredslice -r sh

Slice:

The next example shows what the slice looks like. As you can see, the slice organizes the scope and the service into a hierarchy.

Figure 3: systemctl status fredslice.slice

Resource management in cgroups

  • systemd divides resources among slices and creates four of them by default:

    • The “root” slice (-.slice)

    • Users (user.slice)

    • Services (system.slice)

    • Machine (machine.slice)

    • Additional slices can be created as needed

  • Each slice has its own cgroup, which lets you control the amount of resources it can use.

    • Processes under a slice share its resources.

    • CPU and memory limits can be set on a slice.

    • A systemd unit is always associated with its own cgroup.

    • Because systemd uses cgroups, precise limits can be set on CPU and memory usage, as well as on other resources.

Useful systemd commands

To see what services and other units (service, mount, path, socket, and so on) are associated with a particular target, type this command:

systemctl list-dependencies multi-user.target

To see the dependencies of a service, use the list-dependencies command:

systemctl list-dependencies atomic-openshift-node.service

To list specific types of units:

systemctl list-units --type service

systemctl list-units --type mount    

To list all unit files installed on the system, along with their enablement states:

systemctl list-unit-files

To view resource usage for each control group (for example, the cgroup of a particular service), run systemd-cgtop. Once it is running, you can press keys to sort by memory (m), CPU (c), task (t), path (p), or I/O load (i):

systemd-cgtop

To output a recursive list of cgroup content:

systemd-cgls

Cgroups limit

Using cgroups, we can divide resources among processes. This section looks, from a Red Hat Enterprise Linux perspective, at how to set CPU and memory limits and how to monitor the resources that cgroups assign.

I created two different scenarios for setting limits in a cgroup. The linked external page gives the detailed steps.
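
Those detailed steps are not reproduced here. As a rough illustration only, one way to apply a CPU and memory limit to a unit is through systemd's CPUQuota= and MemoryLimit= properties. The sketch below is not necessarily what the linked scenarios do; limit_unit and DRY_RUN are hypothetical names, and the unit is assumed to exist already.

```shell
#!/bin/sh
# Sketch: apply a CPU quota and a memory limit to a systemd unit.
# limit_unit and DRY_RUN are hypothetical names used for illustration.
# CPUQuota= and MemoryLimit= are systemd resource-control properties
# (cgroup v1, as on Red Hat Enterprise Linux 7).
limit_unit() {
    unit="$1"; cpu_quota="$2"; mem_limit="$3"
    set -- systemctl set-property "$unit" \
        "CPUQuota=$cpu_quota" "MemoryLimit=$mem_limit"
    if [ -n "${DRY_RUN:-}" ]; then
        # Print the command instead of running it.
        printf '%s\n' "$*"
    else
        "$@"
    fi
}
```

For example, running limit_unit fredunit.scope 50% 512M as root would ask systemd to cap the scope from the earlier example at half of one CPU and 512 MiB of memory. Setting DRY_RUN=1 first only prints the systemctl command.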

Now that we know how to set the limit, let’s test it. To do this, we will put load on memory and CPU and see whether the limit configuration really keeps the process from exceeding its resource limits.
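
When checking a CPU limit, it helps to read back what the kernel actually enforces. With cgroup v1, the cpu controller stores a limit as a quota and a period in microseconds (cpu.cfs_quota_us and cpu.cfs_period_us). The helper below is a small sketch that converts those two values into a percentage of one CPU; cpu_quota_percent is a hypothetical name.

```shell
#!/bin/sh
# Sketch: turn cgroup v1 CFS values into a CPU percentage.
# cpu_quota_percent is a hypothetical helper name.
#   $1: contents of cpu.cfs_quota_us  (-1 means no limit)
#   $2: contents of cpu.cfs_period_us
cpu_quota_percent() {
    quota="$1"; period="$2"
    if [ "$quota" -lt 0 ]; then
        echo unlimited
    else
        echo $(( quota * 100 / period ))
    fi
}
```

The two arguments would typically come from the files of the cgroup under test, for example under /sys/fs/cgroup/cpu/fredslice.slice/ (this path is an assumption and depends on where the limit was set). A quota of 50000 with a period of 100000 corresponds to 50 percent of one CPU.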

Lastly, I will try to build slices similar to the ones Kubernetes uses, and I hope this gives you insight into how Kubernetes uses cgroups for LimitRange. Kubernetes assigns each pod one of three Quality of Service (QoS) classes (Burstable, Guaranteed, or BestEffort) and creates slices based on that class. Slices are generated by creating directories under /sys/fs/cgroup, and the result looks like a chain of nested slices.
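
The chain follows systemd's slice naming rule: the dashes in a slice name encode its parents, so a-b.slice lives inside a.slice. The sketch below expands a slice name into the nested directory path it occupies under a controller mount. The function name slice_to_cgroup_path and the pod identifier in the example are made up for illustration.

```shell
#!/bin/sh
# Sketch: expand a systemd slice name into its nested cgroup path.
# systemd encodes the hierarchy in the name: a-b.slice lives inside a.slice.
# slice_to_cgroup_path is a hypothetical helper name.
slice_to_cgroup_path() {
    rest="${1%.slice}"
    prefix=""
    path=""
    while [ -n "$rest" ]; do
        # Take the next dash-separated component of the name.
        part="${rest%%-*}"
        if [ "$part" = "$rest" ]; then
            rest=""
        else
            rest="${rest#*-}"
        fi
        # Each level repeats all parent components in its own name.
        prefix="${prefix:+$prefix-}$part"
        path="$path/$prefix.slice"
    done
    printf '%s\n' "$path"
}
```

For instance, slice_to_cgroup_path kubepods-burstable-pod1234.slice prints /kubepods.slice/kubepods-burstable.slice/kubepods-burstable-pod1234.slice, which is relative to a controller directory such as /sys/fs/cgroup/memory.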

The results of Scenario 4 are briefly summarized as follows:

  1. The slices that I created
    Figure 4: Self-created slices

  2. The slices that Kubernetes created
    Figure 5: Kubernetes created slice

What do you think? They are very similar to each other, aren’t they?

Conclusion

Red Hat OpenShift Container Platform/Kubernetes runs containers, and containers rely on cgroups, so limits are set the same way as for any other process. OpenShift Container Platform/Kubernetes assigns each pod a QoS (Quality of Service) class, and a chain of slices is created recursively in /sys/fs/cgroup based on that class. The chain of slices allows resource limits to be set for each container just as for a normal process. Through this series of demo scenarios, I hope you have a better understanding of how OpenShift Container Platform sets limits with cgroups.

I would like to thank Frédéric Giloux and Marc Richter. This blog is built on top of their wonderful blogs: “Controlling resources with cgroups for performance testing” and the “World domination with cgroups” series.


About the author

Jooho Lee is a senior OpenShift Technical Account Manager (TAM) in Toronto supporting middleware products (EAP, DataGrid, Web Server) and cloud products (Docker, Kubernetes, OpenShift, Ansible). He is an active member of JBoss User Group Korea and the OpenShift / Ansible Group.
