Red Hat OpenShift Service on AWS (ROSA) is a fully managed application platform that offers a more seamless experience for building, deploying, and scaling applications. For machine learning (ML) workloads, ROSA now supports On-Demand Capacity Reservations (ODCR) and Capacity Blocks for ML, allowing cloud architects and platform administrators to strategically utilize their existing AWS purchases to help deliver uninterrupted access to essential compute infrastructure.
Today, ROSA is available in over 30 regions and supports over 600 instance types, allowing customers to run diverse workloads according to their business needs. However, maintaining guaranteed or uninterrupted access to a specific infrastructure type in a particular availability zone (AZ) is important for several critical scenarios:
- GPU-based accelerated computing workloads: Gaining uninterrupted access to accelerated computing (GPU) instances is vital for AI/ML teams conducting training, fine-tuning, or inference workloads. Capacity reservation helps eliminate the risk of compute unavailability for these time-sensitive, resource-intensive tasks.
- Planned scaling events: Scaling infrastructure to confidently support planned business events, such as peak traffic seasons, major product launches, or scheduled batch processing, without provisioning delays.
- High availability and disaster recovery: Enhancing resiliency by guaranteeing capacity when deploying workloads across multiple AZs or executing disaster recovery protocols across regions.
Amazon EC2 Capacity Reservations allow you to reserve compute capacity for your Amazon EC2 instances in a specific AZ for any duration. Capacity Blocks for ML allow you to reserve GPU-based accelerated computing instances for a future date to support your short-duration ML workloads. With support for Capacity Reservations in clusters with hosted control planes (HCP), platform administrators can now create ROSA machine pools in their cluster that directly consume the capacity already reserved with AWS.
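As a concrete illustration, the following sketch uses the AWS SDK for Python (boto3) to create a targeted On-Demand Capacity Reservation for GPU instances in a single AZ. The instance type, zone, count, and tag values are placeholders you would replace with your own, and the exact parameters appropriate for your account and workload may differ.

```python
# Minimal sketch: reserve GPU capacity in one AZ with boto3.
# Instance type, AZ, count, and tags below are illustrative placeholders.
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")

response = ec2.create_capacity_reservation(
    InstanceType="p4d.24xlarge",          # must match the machine pool's instance type
    InstancePlatform="Linux/UNIX",
    AvailabilityZone="us-east-1a",        # must match the AZ of the machine pool's subnet
    InstanceCount=2,                      # must cover the planned node replicas
    InstanceMatchCriteria="targeted",     # only instances that target this reservation consume it
    EndDateType="unlimited",              # billed at on-demand rates until you cancel it
    TagSpecifications=[{
        "ResourceType": "capacity-reservation",
        "Tags": [{"Key": "purpose", "Value": "rosa-ml-machine-pool"}],
    }],
)

reservation = response["CapacityReservation"]
print(reservation["CapacityReservationId"], reservation["State"])
```

If you are purchasing Capacity Blocks for ML instead, recent boto3 releases expose describe_capacity_block_offerings and purchase_capacity_block for reserving GPU capacity over a fixed future window rather than an open-ended reservation.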
Key best practices for effectively leveraging Capacity Reservations with ROSA:
- Pre-planning of AZs, instance types, and capacity: Before creating a machine pool, ensure a precise match between the reserved capacity and the ROSA machine pool attributes, including the VPC subnets, the number of node replicas, and the instance type. When reserving capacity for a future date, carefully balance the relative costs of purchasing capacity across different AZs against technical constraints such as VPC subnet size, available IPs, and node replica requirements. You must wait until the AWS Capacity Reservation status is active before attempting to provision ROSA machine pools that use it (see the first sketch following this list).
- Informed decision on instance matching criteria: AWS provides two instance matching criteria for ODCRs: open and targeted. Choose a strategy based on your workload distribution. If you run multiple workloads across different services and intend to reserve capacity exclusively for your ROSA clusters, using the targeted matching criteria is strongly recommended. Remember that ODCRs operate on a ‘use it or lose it’ principle: they are billed at on-demand rates regardless of utilization.
- Precise control over reserved capacity consumption: ROSA offers flexible controls that define how workloads consume EC2 instances across on-demand capacity and Capacity Reservations. For example, you can decide whether a machine pool should fall back to on-demand instances or fail when the configured Capacity Reservation is exhausted.
- Centralized management and allocation of purchases: For organizations managing multiple AWS accounts, the ability to centralize the purchase of ODCRs and allocate them across member accounts with AWS Resource Access Manager is a significant benefit. ROSA fully supports Capacity Reservations that are shared with the AWS account where the cluster is created, simplifying financial management and ensuring all teams benefit from reserved capacity (see the second sketch following this list).
- Proactive monitoring of Capacity Reservation utilization: Because multiple workloads or accounts may share a reservation, it's crucial to monitor Capacity Reservation utilization continuously; cluster-specific utilization can fluctuate widely over time. Proactively planning for conditions such as the exhaustion of reserved capacity can prevent ROSA cluster nodes from becoming unavailable to critical workloads (the third sketch following this list shows one way to watch utilization).
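Because a machine pool can only consume a reservation once its status is active, a small check like the following can gate your provisioning automation. This is a sketch using boto3; the reservation ID is a placeholder.

```python
# Sketch: wait for a Capacity Reservation to become active before creating
# the ROSA machine pool that will consume it (the ID below is a placeholder).
import time
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")
reservation_id = "cr-0123456789abcdef0"

while True:
    result = ec2.describe_capacity_reservations(
        CapacityReservationIds=[reservation_id]
    )
    state = result["CapacityReservations"][0]["State"]
    if state == "active":
        break                              # safe to provision the machine pool
    if state in ("cancelled", "expired", "failed"):
        raise RuntimeError(f"Reservation {reservation_id} is {state}")
    time.sleep(30)                         # still pending: keep polling
```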
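For the centralized purchasing pattern, the sketch below shares a reservation from a purchasing account with a member account using AWS Resource Access Manager. The reservation ARN and account ID are placeholders, and your organization's sharing policies may call for different settings.

```python
# Sketch: share a Capacity Reservation with the AWS account that owns the
# ROSA cluster, using AWS RAM (ARN and account ID are placeholders).
import boto3

ram = boto3.client("ram", region_name="us-east-1")

share = ram.create_resource_share(
    name="rosa-gpu-capacity",
    resourceArns=[
        "arn:aws:ec2:us-east-1:111111111111:capacity-reservation/cr-0123456789abcdef0"
    ],
    principals=["222222222222"],          # AWS account where the ROSA cluster lives
    allowExternalPrincipals=False,        # keep the share inside the organization
)
print(share["resourceShare"]["resourceShareArn"])
```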
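For ongoing utilization monitoring, EC2 publishes per-reservation metrics to CloudWatch under the AWS/EC2CapacityReservations namespace. The sketch below reads one of them; the reservation ID and the alerting logic are placeholders you would adapt to your own monitoring stack.

```python
# Sketch: read Capacity Reservation utilization from CloudWatch so exhaustion
# can be detected before a machine pool fails to scale (ID is a placeholder).
from datetime import datetime, timedelta, timezone
import boto3

cloudwatch = boto3.client("cloudwatch", region_name="us-east-1")
reservation_id = "cr-0123456789abcdef0"

now = datetime.now(timezone.utc)
stats = cloudwatch.get_metric_statistics(
    Namespace="AWS/EC2CapacityReservations",
    MetricName="AvailableInstanceCount",
    Dimensions=[{"Name": "CapacityReservationId", "Value": reservation_id}],
    StartTime=now - timedelta(minutes=15),
    EndTime=now,
    Period=300,
    Statistics=["Minimum"],
)

datapoints = sorted(stats["Datapoints"], key=lambda d: d["Timestamp"])
if datapoints and datapoints[-1]["Minimum"] == 0:
    print(f"Reservation {reservation_id} is exhausted; machine pool scale-ups "
          "may fall back to on-demand or fail, depending on your configuration.")
```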
To learn more about how to purchase Capacity Reservations and Capacity Blocks for ML, read the AWS documentation. To learn more about managing machine pools and setting capacity preferences in your ROSA cluster, read the Managing Nodes chapter in the ROSA documentation.
To get started with ROSA, visit the ROSA product page.
About the authors
Bala Chandrasekaran is a Product Manager on the Managed OpenShift Cloud Services team. He has over 20 years of experience across cloud-native technologies, infrastructure, and data systems.
Brae Troutman is a Software Engineer supporting the ROSA HCP commercial and FedRAMP offerings. He is in his first five years of working on cloud platforms as a service, with a particular interest in declarative configuration management, durable microservice approaches to cloud services, and continuous learning in his field.