Topology Spread Constraints
OpenShift Monitoring is the monitoring and observability stack of OpenShift, which is built on top of the Kubernetes container orchestration platform. It provides monitoring and alerting capabilities that let you track the health and performance of your applications running on OpenShift.
Since OpenShift 4.10, the monitoring component replicas are deployed with hard anti-affinity. This avoids the risk of a single node outage disrupting the cluster's monitoring functionality.
In OpenShift Monitoring 4.12, you can specify topology spread constraints for Prometheus, Alertmanager, and Thanos Ruler in addition to the existing hard anti-affinity settings. Topology spread constraints let you define finer-grained rules that control where these components are placed on your cluster. For example, you can require that Prometheus instances are distributed across different failure domains, so that losing a single zone does not take out every replica. You specify the constraints using the openshift_monitoring_prometheus_topology_spread_constraints, openshift_monitoring_alertmanager_topology_spread_constraints, and openshift_monitoring_thanos_ruler_topology_spread_constraints variables in your OpenShift Monitoring installation configuration.
Overall, the ability to specify topology spread constraints can help improve the resiliency and availability of your monitoring and alerting infrastructure.
Topology spread constraints control how pods are distributed across your cluster. A typical use is to spread pods evenly across failure domains (such as zones or regions), so that an outage in one domain affects only a fraction of the replicas. This improves the resiliency of your applications and infrastructure.
Pod placement also affects network latency. Topology spread constraints themselves push pods apart, so they are not the tool for co-locating workloads. If you have applications that need to communicate with each other frequently, use pod affinity rules to place the relevant pods in the same zone or region, and use spread constraints to balance the replicas across the domains you have chosen.
Together with affinity rules, topology spread constraints give you direct control over the placement of pods within your cluster, which you can use to balance the performance and reliability of your applications.
Affinity
In OpenShift Observability, you can use affinity and topology constraints to control the placement of pods within your cluster. This can help you optimize the performance and reliability of your applications.
The central element of a topology spread constraint definition is the topology key. The topology key is a node label that associates a node with a particular facet of a cluster's topology. We recommend using well-known label names such as kubernetes.io/hostname, topology.kubernetes.io/zone, and topology.kubernetes.io/region, but any label will work. All nodes that have the same value for a particular topology key are considered to be in the same domain.
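As a sketch of what these labels look like in practice, the manifests below describe two hypothetical worker nodes. The node names and label values are made up for illustration. Both nodes carry the same value for topology.kubernetes.io/zone, so a constraint keyed on that label treats them as one domain, while a constraint keyed on kubernetes.io/hostname treats them as two.

```yaml
# Hypothetical nodes; names and label values are illustrative only.
apiVersion: v1
kind: Node
metadata:
  name: worker-0
  labels:
    kubernetes.io/hostname: worker-0
    topology.kubernetes.io/zone: zone-a
    topology.kubernetes.io/region: region-1
---
apiVersion: v1
kind: Node
metadata:
  name: worker-1
  labels:
    kubernetes.io/hostname: worker-1
    topology.kubernetes.io/zone: zone-a       # same zone as worker-0
    topology.kubernetes.io/region: region-1
```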
The label selector field specifies which existing pods are counted when a new pod is scheduled. Beyond that, only two more details must be specified: what the scheduler should do if it cannot satisfy the constraint (whenUnsatisfiable) and how much imbalance it should tolerate (maxSkew).
# Be sure that Alertmanager instances are evenly distributed across two failure domains (e.g., two different zones)
openshift_monitoring_alertmanager_topology_spread_constraints:
  - topologyKey: topology.kubernetes.io/zone
    whenUnsatisfiable: DoNotSchedule
    maxSkew: 1
    labelSelector:
      matchExpressions:
        - key: app
          operator: In
          values:
            - alertmanager

# Be sure that Thanos Ruler instances are evenly distributed across three failure domains (e.g., three different regions)
openshift_monitoring_thanos_ruler_topology_spread_constraints:
  - topologyKey: topology.kubernetes.io/region
    whenUnsatisfiable: DoNotSchedule
    maxSkew: 1
    labelSelector:
      matchExpressions:
        - key: app
          operator: In
          values:
            - thanos-ruler
In these examples, the topologyKey field specifies the infrastructure level at which the topology spread constraint is applied (e.g., hostname, zone, region). The whenUnsatisfiable field specifies what should happen when the constraint cannot be satisfied: DoNotSchedule leaves the pod unscheduled, while ScheduleAnyway lets the scheduler place it and only prefer placements that reduce the imbalance. The maxSkew field specifies the maximum allowed difference in the number of matching pods between topology domains. For example, with maxSkew set to 1 and three zones holding 2, 1, and 1 matching pods, a new pod can only go to one of the zones holding 1, because placing it in the first zone would raise the difference to 2. Finally, the labelSelector field selects the pods that are counted when the spread is calculated.
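The examples above cover Alertmanager and Thanos Ruler; for Prometheus the same four fields apply. The sketch below is not taken from this post. It assumes that the monitoring stack reads its settings from a ConfigMap named cluster-monitoring-config in the openshift-monitoring namespace, and that the Prometheus component accepts a topologySpreadConstraints list under a prometheusK8s key. The ConfigMap name, the key names, and the pod label are all assumptions, so check them against the product documentation for your cluster version before applying anything.

```yaml
# Sketch only: the ConfigMap name, the prometheusK8s key, and the pod label
# are assumptions. Verify them against the documentation for your version.
apiVersion: v1
kind: ConfigMap
metadata:
  name: cluster-monitoring-config
  namespace: openshift-monitoring
data:
  config.yaml: |
    prometheusK8s:
      topologySpreadConstraints:
        - maxSkew: 1                                  # at most one pod of difference
          topologyKey: topology.kubernetes.io/zone    # spread across zones
          whenUnsatisfiable: DoNotSchedule            # hard requirement
          labelSelector:
            matchLabels:
              app.kubernetes.io/name: prometheus      # assumed pod label
```

Using ScheduleAnyway instead of DoNotSchedule would turn this into a preference, which may be the safer choice on clusters with fewer zones than replicas.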
Other updates for OpenShift Monitoring 4.12
In OpenShift Monitoring 4.12, admins can create new alerting rules based on platform metrics. This feature is available as a Technology Preview, which means that it is still under development and may change in future releases.
Alerting rules based on platform metrics let admins define alerts that fire on the specific metric values that matter in their environment, in addition to the rules shipped with the platform. This can help them detect and troubleshoot issues more quickly, and it is especially useful for monitoring the health and performance of applications running on OpenShift.
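To give a rough idea of what such a rule could look like, here is a minimal sketch. It assumes that the Technology Preview exposes an AlertingRule custom resource in the monitoring.openshift.io/v1alpha1 API group and that it must live in the openshift-monitoring namespace; neither detail comes from this post. The rule name, group name, alert name, and expression are hypothetical, and the feature may need to be enabled explicitly on the cluster.

```yaml
# Sketch only: apiVersion and kind are assumed for the 4.12 Technology
# Preview and may differ in other releases.
apiVersion: monitoring.openshift.io/v1alpha1
kind: AlertingRule
metadata:
  name: example-platform-alert          # hypothetical name
  namespace: openshift-monitoring
spec:
  groups:
    - name: example-rules               # hypothetical group name
      rules:
        - alert: ExampleMonitoringTargetDown   # hypothetical alert name
          # Fires when a scrape target in the monitoring namespace stays down.
          expr: up{namespace="openshift-monitoring"} == 0
          for: 10m
          labels:
            severity: warning
          annotations:
            summary: A monitoring scrape target has been down for 10 minutes.
```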
For more information, check out the OpenShift Container Platform 4.12 release notes.
About the author
Roger Florén, a dynamic and forward-thinking leader, currently serves as the Principal Product Manager at Red Hat, specializing in Observability. His journey in the tech industry is marked by high performance and ambition, transitioning from a senior developer role to a principal product manager. With a strong foundation in technical skills, Roger is constantly driven by curiosity and innovation. At Red Hat, Roger leads the Observability platform team, working closely with in-cluster monitoring teams and contributing to the development of products like Prometheus, AlertManager, Thanos and Observatorium. His expertise extends to coaching, product strategy, interpersonal skills, technical design, IT strategy and agile project management.