Red Hat OpenShift Service on AWS (ROSA) is a fully managed application platform that offers a more seamless experience for building, deploying, and scaling applications. For machine learning (ML) workloads, ROSA now supports On-Demand Capacity Reservations (ODCR) and Capacity Blocks for ML, allowing cloud architects and platform administrators to strategically utilize their existing AWS purchases to help deliver uninterrupted access to essential compute infrastructure.
Today, ROSA is available in over 30 regions and supports over 600 instance types, allowing customers to run diverse workloads according to their business needs. However, maintaining guaranteed or uninterrupted access to a specific infrastructure type in a particular availability zone (AZ) is important for several critical scenarios:
- GPU-based accelerated computing workloads: Gaining uninterrupted access to accelerated computing (GPU) instances is vital for AI/ML teams conducting training, fine-tuning, or inference workloads. Capacity reservation helps eliminate the risk of compute unavailability for these time-sensitive, resource-intensive tasks.
- Planned scaling events: Scaling infrastructure to confidently support planned business events, such as peak traffic seasons, major product launches, or scheduled batch processing, without provisioning delays.
- High availability and disaster recovery: Enhancing resiliency by guaranteeing capacity when deploying workloads across multiple AZs or executing disaster recovery protocols across regions.
Amazon EC2 Capacity Reservations allow you to reserve compute capacity for your Amazon EC2 instances in a specific AZ for any duration. Capacity Blocks for ML allow you to reserve GPU-based accelerated computing instances for a future date to support your short-duration ML workloads. With support for Capacity Reservations in clusters with hosted control planes (HCP), platform administrators can now create ROSA machine pools in their cluster that directly consume the capacity already reserved with AWS.
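As a concrete illustration, the following sketch uses the AWS SDK for Python (boto3) to create a targeted On-Demand Capacity Reservation for GPU instances in a single AZ. The instance type, zone, count, and tag values are placeholders you would replace with your own, and the exact parameters appropriate for your account and workload may differ.

```python
# Minimal sketch: reserve GPU capacity in one AZ with boto3.
# Instance type, AZ, count, and tags below are illustrative placeholders.
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")

response = ec2.create_capacity_reservation(
    InstanceType="p4d.24xlarge",          # must match the machine pool's instance type
    InstancePlatform="Linux/UNIX",
    AvailabilityZone="us-east-1a",        # must match the AZ of the machine pool's subnet
    InstanceCount=2,                      # must cover the planned node replicas
    InstanceMatchCriteria="targeted",     # only instances that target this reservation consume it
    EndDateType="unlimited",              # billed at on-demand rates until you cancel it
    TagSpecifications=[{
        "ResourceType": "capacity-reservation",
        "Tags": [{"Key": "purpose", "Value": "rosa-ml-machine-pool"}],
    }],
)

reservation = response["CapacityReservation"]
print(reservation["CapacityReservationId"], reservation["State"])
```

If you are purchasing Capacity Blocks for ML instead, recent boto3 releases expose describe_capacity_block_offerings and purchase_capacity_block for reserving GPU capacity over a fixed future window rather than an open-ended reservation.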
Key best practices for effectively leveraging Capacity Reservations with ROSA:
- Pre-planning of AZs, instance types, and capacity: Before creating a machine pool, ensure a precise match between the reserved capacity and the ROSA machine pool attributes, including the VPC subnets, the number of node replicas, and the instance type. When reserving capacity for a future date, carefully balance the relative costs of purchasing capacity across different AZs against technical constraints such as VPC subnet size, available IPs, and node replica requirements. You must wait until the AWS Capacity Reservation status is active before attempting to provision ROSA machine pools that use it (see the first sketch following this list).
- Informed decision on instance matching criteria: AWS provides two instance matching criteria for ODCRs: open and targeted. Choose a strategy based on your workload distribution. If you run multiple workloads across different services and intend to reserve capacity exclusively for your ROSA clusters, using the targeted matching criteria is strongly recommended. Remember that ODCRs operate on a ‘use it or lose it’ principle: they are billed at on-demand rates regardless of utilization.
- Precise control over reserved capacity consumption: ROSA offers flexible controls that define how workloads consume EC2 instances across on-demand capacity and Capacity Reservations. For example, you can decide whether a machine pool should fall back to on-demand instances or fail when the configured Capacity Reservation is exhausted.
- Centralized management and allocation of purchases: For organizations managing multiple AWS accounts, the ability to centralize the purchase of ODCRs and allocate them across member accounts with AWS Resource Access Manager is a significant benefit. ROSA fully supports Capacity Reservations that are shared with the AWS account where the cluster is created, simplifying financial management and ensuring all teams benefit from reserved capacity (see the second sketch following this list).
- Proactive monitoring of Capacity Reservation utilization: Because multiple workloads or accounts may share a reservation, it's crucial to monitor Capacity Reservation utilization continuously; cluster-specific utilization can fluctuate widely over time. Proactively planning for conditions such as the exhaustion of reserved capacity can prevent ROSA cluster nodes from becoming unavailable to critical workloads (the third sketch following this list shows one way to watch utilization).
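Because a machine pool can only consume a reservation once its status is active, a small check like the following can gate your provisioning automation. This is a sketch using boto3; the reservation ID is a placeholder.

```python
# Sketch: wait for a Capacity Reservation to become active before creating
# the ROSA machine pool that will consume it (the ID below is a placeholder).
import time
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")
reservation_id = "cr-0123456789abcdef0"

while True:
    result = ec2.describe_capacity_reservations(
        CapacityReservationIds=[reservation_id]
    )
    state = result["CapacityReservations"][0]["State"]
    if state == "active":
        break                              # safe to provision the machine pool
    if state in ("cancelled", "expired", "failed"):
        raise RuntimeError(f"Reservation {reservation_id} is {state}")
    time.sleep(30)                         # still pending: keep polling
```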
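For the centralized purchasing pattern, the sketch below shares a reservation from a purchasing account with a member account using AWS Resource Access Manager. The reservation ARN and account ID are placeholders, and your organization's sharing policies may call for different settings.

```python
# Sketch: share a Capacity Reservation with the AWS account that owns the
# ROSA cluster, using AWS RAM (ARN and account ID are placeholders).
import boto3

ram = boto3.client("ram", region_name="us-east-1")

share = ram.create_resource_share(
    name="rosa-gpu-capacity",
    resourceArns=[
        "arn:aws:ec2:us-east-1:111111111111:capacity-reservation/cr-0123456789abcdef0"
    ],
    principals=["222222222222"],          # AWS account where the ROSA cluster lives
    allowExternalPrincipals=False,        # keep the share inside the organization
)
print(share["resourceShare"]["resourceShareArn"])
```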
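For ongoing utilization monitoring, EC2 publishes per-reservation metrics to CloudWatch under the AWS/EC2CapacityReservations namespace. The sketch below reads one of them; the reservation ID and the alerting logic are placeholders you would adapt to your own monitoring stack.

```python
# Sketch: read Capacity Reservation utilization from CloudWatch so exhaustion
# can be detected before a machine pool fails to scale (ID is a placeholder).
from datetime import datetime, timedelta, timezone
import boto3

cloudwatch = boto3.client("cloudwatch", region_name="us-east-1")
reservation_id = "cr-0123456789abcdef0"

now = datetime.now(timezone.utc)
stats = cloudwatch.get_metric_statistics(
    Namespace="AWS/EC2CapacityReservations",
    MetricName="AvailableInstanceCount",
    Dimensions=[{"Name": "CapacityReservationId", "Value": reservation_id}],
    StartTime=now - timedelta(minutes=15),
    EndTime=now,
    Period=300,
    Statistics=["Minimum"],
)

datapoints = sorted(stats["Datapoints"], key=lambda d: d["Timestamp"])
if datapoints and datapoints[-1]["Minimum"] == 0:
    print(f"Reservation {reservation_id} is exhausted; machine pool scale-ups "
          "may fall back to on-demand or fail, depending on your configuration.")
```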
To learn more about how to purchase Capacity Reservations and Capacity Blocks for ML, read the AWS documentation. To learn more about managing machine pools and setting capacity preferences in your ROSA cluster, read the Managing Nodes chapter in the ROSA documentation.
To get started with ROSA, visit the ROSA product page.
About the authors
Bala Chandrasekaran is a Product Manager on the Managed OpenShift Cloud Services team. He has over 20 years of experience across cloud-native technologies, infrastructure, and data systems.
Brae Troutman is a Software Engineer supporting the ROSA HCP commercial and FedRAMP offerings. He is in his first five years of working on cloud platforms as a service, with a particular interest in declarative configuration management, durable microservice approaches to cloud services, and continuous learning in his field.