Performance is a critical focus area of improvement with every release of Red Hat Enterprise Linux (RHEL). With RHEL 8.4, we introduced several key performance enhancements for system administrators and performance engineers to enable simpler troubleshooting experiences, richer visualization of performance metrics, and deeper performance insights. In this blog, we will go over some of these capabilities.
The newly improved RHEL web console
Just imagine that you only have a couple of minutes, and you want to quickly access live and historical performance metrics across CPU, memory, network and storage resources to understand their utilization, saturation and errors. What do you do?
With the new interface in the RHEL 8.4 web console, administrators can look at some key metrics and more easily identify where to start troubleshooting performance issues — they have a better understanding of what's going on with their local systems, as well as across the network environment.
If you’re new to the RHEL web console, the performance metrics can be found under the System -> Overview - Usage card. Before you can see the history of performance metrics, you need to install the cockpit-pcp package.
You can click the “Install cockpit-pcp” button within the Usage card section to install this package.
This new interface is divided into two sections. Across the top, we have four cards showing real-time data for CPU, Memory, Storage, and Network:
For CPU, you’ll see that we show utilization and saturation and then provide utilization metrics for CPU hot services. For memory, we show RAM utilization and swap usage, and then provide memory utilization metrics for the top services. For disks, we show read/write IO metrics and disk utilization percentages. For networking, we show bytes in and out across each network interface.
Along the bottom, provided you have pcp metric collection installed and enabled as explained earlier, you will see something similar to:
This graphic presents performance data from the most recent data to metrics collected as much as several weeks old. On the left, you get a list of events that may be of interest from a performance perspective, along with timestamps synced to the local machine clock. On the right, you get metrics representing CPU, Memory, Disks, and Network visualized in four columns.
For each column, utilization metrics are on the left and saturation metrics on the right of the center column line. Hovering over the graph will show you a tooltip with the precise numbers at a particular point in time. This allows you to get a better sense of where to start looking at when doing performance troubleshooting and when a performance problem might have started showing up..
Deeper Insights with Flame graphs
Now, there are times when the simple troubleshooting with the RHEL web console just cannot narrow down the root cause of a performance problem. In these cases, you need deeper insights into what’s going on, and for this reason, we have on-CPU flame graphs in RHEL.
This functionality allows you to profile what’s taking up cycles on your CPU and then determine where your CPU is spending a lot of cycles. Being able to visualize this quickly is a significant step forward from having to look at pages full of stack trace data.
Before proceeding, make sure to have the packages perf and js-d3-flame-graph installed on your system.
To capture flame graph data, you can run the perf
command with these options. After you’ve let the command run for a period of time (say 30 seconds), you’ll want to stop data collection by issuing a Ctrl-C in the terminal.
[root@bigdemo1 ~]# perf script record flamegraph -F 99 -g ^C [ perf record: Woken up 3 times to write data ] [ perf record: Captured and wrote 1.194 MB perf.data (3448 samples) ]
To output the flame graph, you can run :
[root@bigdemo1 ~]# perf script report flamegraph
Which will then report:
dumping data to flamegraph.html
When we browse to the result flamegraph.html
file, you will see something like the following:
The graph represents the 30 seconds of activity on the CPU that we profiled with perf
. You can read this visualization from the bottom to the top to see the flow of the stack traces. Horizontally, the more space occupied by part of the trace, the more cycles that it has occupied on the CPU. Kernel stacks are shown in a shade of orange and user space stacks have a “warm” hue.
If you look closely at the above chart, the first line on the bottom simply says “root”, indicating the root of the stack frame trees. Moving up a line, we get the processes, which in this flame graph are “SetroubleshootP, mysqld, apache2, 240998, rpm, ksoftirqd, kworker, grafana-server, and redis-server”.
If you zoom in a bit more (you can click on each stack frame), you will notice that mysqld happens to have the most red activity on the host:
As we can see from the above, the “hottest” code path is the “start_thread” path that initiates a mysql_select that invokes a MySQL JOIN operation. Knowing that JOINS are CPU intensive in a SQL database makes this diagram make sense and not necessarily something to worry about, but if we didn’t know that, then the above tells us that our JOINS are taking a decent bit of CPU time.
While there is nothing necessarily wrong with the CPU consumption pattern in this particular case, there are many times where an application is unfairly bottlenecking the CPU and “running hot”.
Now, we have only scratched the surface when it comes to performance enhancements in RHEL 8.4.
Stay tuned for our upcoming posts, where we will share additional performance tools in RHEL such as improvements to our PCP and Grafana stack. In the meantime, we hope that you’ll find these new features extremely useful and efficient in helping you get to the root of performance problems. It’s time to get started with RHEL 8.4.
저자 소개
Karl Abbott is a Senior Product Manager for Red Hat Enterprise Linux focused on the kernel and performance. Abbott has been at Red Hat for more than 15 years, previously working with customers in the financial services industry as a Technical Account Manager.
채널별 검색
오토메이션
기술, 팀, 인프라를 위한 IT 자동화 최신 동향
인공지능
고객이 어디서나 AI 워크로드를 실행할 수 있도록 지원하는 플랫폼 업데이트
오픈 하이브리드 클라우드
하이브리드 클라우드로 더욱 유연한 미래를 구축하는 방법을 알아보세요
보안
환경과 기술 전반에 걸쳐 리스크를 감소하는 방법에 대한 최신 정보
엣지 컴퓨팅
엣지에서의 운영을 단순화하는 플랫폼 업데이트
인프라
세계적으로 인정받은 기업용 Linux 플랫폼에 대한 최신 정보
애플리케이션
복잡한 애플리케이션에 대한 솔루션 더 보기
오리지널 쇼
엔터프라이즈 기술 분야의 제작자와 리더가 전하는 흥미로운 스토리
제품
- Red Hat Enterprise Linux
- Red Hat OpenShift Enterprise
- Red Hat Ansible Automation Platform
- 클라우드 서비스
- 모든 제품 보기
툴
체험, 구매 & 영업
커뮤니케이션
Red Hat 소개
Red Hat은 Linux, 클라우드, 컨테이너, 쿠버네티스 등을 포함한 글로벌 엔터프라이즈 오픈소스 솔루션 공급업체입니다. Red Hat은 코어 데이터센터에서 네트워크 엣지에 이르기까지 다양한 플랫폼과 환경에서 기업의 업무 편의성을 높여 주는 강화된 기능의 솔루션을 제공합니다.