Today we're proud to announce the open sourcing of our monitoring scripts. The OpenShift Online Operations Team has published the OpenShift-Zabbix repository, which contains the scripts needed to monitor an OpenShift installation.
We use these scripts to monitor the OpenShift Online environment using Zabbix. They are also aimed at giving OpenShift Enterprise and OpenShift Origin users a good starting point for monitoring their own OpenShift deployments.
Repository Structure
The OpenShift-Zabbix repository is structured in the standard Puppet module format. We don't expect every consumer to also use Puppet in their infrastructure. Puppet is the tool we use in the OpenShift Online environment, and the manifests double as documentation of how the scripts are intended to be deployed. Users can consume the Puppet manifests as-is, or use them as a guide to integrate the scripts into their own configuration management infrastructure.
The scripts themselves reside in the files/checks/ directory, and the library files they depend on are in the files/lib/ directory. These files are expected to be deployed to a bin/ and a lib/ directory respectively (e.g. /usr/local/bin and /usr/local/lib).
Also included in the repository is the files/xml/ directory. This directory contains XML-based template files used by Zabbix to create items, triggers, and graphs. These files will make it easy to quickly configure Zabbix for monitoring the supplied data points.
Finally, the manifests/ directory contains the configuration documentation in the form of Puppet code. The primary details that can be found here are package requirements the scripts will need and the cron jobs that ultimately execute the check scripts to push data into the Zabbix server.
Design Decisions
The decision to use cron to execute the monitoring checks is a result of operational experiences gained by the OpenShift Online Operations Team. We found that Zabbix agent checks tend to have difficulty running at scale. As the number of items grows, the zabbix-agent and the zabbix-server's poller processes can struggle to collect data in a timely manner.
Our solution is to run all of our checks using the zabbix_sender command, which reads a file containing item data and pushes that data up to the Zabbix server. Some of our checks measure things with inherent and unpredictable latency, such as a ping through MCollective from the Broker to the Nodes. This made cron a reasonable choice, given the tradeoffs we needed to make between having fast checks and monitoring inherently variable components.
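As a rough illustration of the zabbix_sender approach described above, here is a minimal Python sketch of a check that writes item data to a file and hands it to zabbix_sender. It is not taken from the repository: the host name, the item keys, and the config path are assumptions made for the example, and it assumes keys and values contain no whitespace. The only zabbix_sender options it relies on are -c (agent config file) and -i (input file of "hostname key value" lines).

```python
import subprocess
import tempfile

# Assumed location of the agent config; adjust for your hosts.
DEFAULT_CONFIG = "/etc/zabbix/zabbix_agentd.conf"


def format_item_lines(host, items):
    """Return one '<hostname> <key> <value>' line per item.

    Assumes keys and values contain no whitespace.
    """
    return [f"{host} {key} {value}" for key, value in items.items()]


def build_sender_command(input_path, config_path=DEFAULT_CONFIG):
    """Build the zabbix_sender invocation that reads items from a file."""
    return ["zabbix_sender", "-c", config_path, "-i", input_path]


def push_items(host, items, config_path=DEFAULT_CONFIG):
    """Write the items to a temporary file and hand it to zabbix_sender."""
    with tempfile.NamedTemporaryFile("w", suffix=".txt") as handle:
        handle.write("\n".join(format_item_lines(host, items)) + "\n")
        handle.flush()
        # Raises CalledProcessError if zabbix_sender exits non-zero.
        subprocess.run(build_sender_command(handle.name, config_path), check=True)


if __name__ == "__main__":
    # Hypothetical host name and item keys, for illustration only.
    demo = {"demo.gears.available": 42, "demo.uids.available": 5958}
    print("\n".join(format_item_lines("broker01.example.com", demo)))
    print(" ".join(build_sender_command("/tmp/items.txt")))
```

A cron job would call something like push_items at whatever interval suits the check, so a slow data source delays only its own run rather than holding up a server-side poller.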
Current Checks
The initial release includes five check scripts. Here is a brief description of what each one does.
- check-accept-node
  - Runs the 'oo-accept-node' command, attempts some automated fixes to known bugs/problems, and then returns the command status.
- check-activemq-stats
  - Collects statistics within the JVM running ActiveMQ for tracking common performance data.
- check-district-capacity
  - Runs the 'oo-stats' command to collect data about capacity utilization of an OpenShift district. Reports available uids and gears within a district.
- check-mc-ping
  - Runs 'mco ping', collates the results, and identifies nodes that are not responding in a timely fashion.
- check-user-action-log
  - Scans /var/log/openshift/user_action.log, parsing event data from the log to provide insight into the health of the broker and general user experience interacting with the service.
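To make the idea behind a check like check-mc-ping more concrete, here is a small Python sketch of collating ping results and flagging nodes that are missing or slow. It is not the repository's implementation. The sketch assumes each reply line looks like "hostname time=12.34 ms", and the node names, the expected-node list, and the 1000 ms threshold are hypothetical values chosen for illustration.

```python
import re

# Assumed shape of one reply line: "<node>   time=<milliseconds> ms"
PING_LINE = re.compile(r"^(?P<node>\S+)\s+time=(?P<ms>[\d.]+)\s+ms$")


def collate_ping(output, expected_nodes, max_ms=1000.0):
    """Summarise ping output: who replied, who is missing, who is slow."""
    seen = {}
    for line in output.splitlines():
        match = PING_LINE.match(line.strip())
        if match:
            seen[match.group("node")] = float(match.group("ms"))
    # Nodes we expected to hear from but did not.
    missing = sorted(set(expected_nodes) - set(seen))
    # Nodes that replied, but slower than the threshold.
    slow = sorted(node for node, ms in seen.items() if ms > max_ms)
    return {"responded": len(seen), "missing": missing, "slow": slow}


if __name__ == "__main__":
    # Hypothetical sample output, for illustration only.
    sample = (
        "node1.example.com      time=85.21 ms\n"
        "node2.example.com      time=1530.04 ms\n"
    )
    expected = ["node1.example.com", "node2.example.com", "node3.example.com"]
    print(collate_ping(sample, expected))
```

The resulting counts and node lists are the kind of values a check could then send to Zabbix as item data.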
Get started!
The OpenShift-Zabbix repository should contain everything you need to begin monitoring your OpenShift deployment, and the README.md documents how to get started. If you're interested in contributing and collaborating with us, fork the code and send us a pull request; more information is in the COLLABORATING.md file.
For us, this is a starting point for open sourcing more of our monitoring work over time. We are interested in providing examples and ideas about how to make the most of your OpenShift installation. In addition, we would like to encourage all operations teams running OpenShift to engage in a conversation about how we keep our infrastructure running. We are excited to help make running OpenShift in a production environment an even easier and better experience.
Next Steps
- Try these scripts out in your environment and let us know what you think in the comments
- Don't have a running OpenShift environment to try this out? Install OpenShift Origin with one shell command and you'll be able to do it in no time.
- Watch the video above to learn monitoring best practices for OpenShift