Behind the queues: How Kueue reimagines scheduling in Red Hat OpenShift

November 17, 2025 | Pannaga Rao Bhoja Ramamanohara, Sohan Kunkerkar | 2-minute read

In a modern cluster, the hardest problem isn't running workloads; it's sharing resources fairly. Red Hat OpenShift clusters are seeing a surge of AI-accelerated workloads, from GPU-intensive training jobs to large batches of inference requests. At the same time, other tenants still need consistent throughput for their everyday CI/CD pipelines and data processing tasks. The result is a constant battle for resources, where some jobs wait too long, others consume more than their fair share, and administrators are left fighting bottlenecks.

This is exactly the challenge that Kueue, a Kubernetes-native job queueing and scheduling framework, was built to solve. It introduces structured queues, priorities, and quota enforcement to bring fairness and predictability back into scheduling.
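To make the idea of queues, priorities, and quota enforcement concrete, here is a minimal Python sketch of quota-aware admission. It is a toy model written for illustration, not Kueue's actual algorithm or API: the Workload class, the admit function, the tenant names, and the GPU quotas are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    name: str
    tenant: str
    priority: int  # higher values are considered first
    gpus: int      # number of GPUs the workload requests


def admit(pending, quota):
    """Admit workloads in priority order while each tenant stays within quota.

    Toy model only: a workload that does not fit is skipped and keeps
    waiting, and lower-priority workloads behind it may still be admitted.
    """
    used = {}
    admitted, waiting = [], []
    # sorted() is stable, so workloads with equal priority keep submission order.
    for w in sorted(pending, key=lambda w: -w.priority):
        if used.get(w.tenant, 0) + w.gpus <= quota.get(w.tenant, 0):
            used[w.tenant] = used.get(w.tenant, 0) + w.gpus
            admitted.append(w.name)
        else:
            waiting.append(w.name)
    return admitted, waiting


pending = [
    Workload("ci-build", "platform", priority=1, gpus=0),
    Workload("train-large", "ml", priority=5, gpus=6),
    Workload("train-small", "ml", priority=3, gpus=4),
    Workload("inference-batch", "ml", priority=2, gpus=2),
]
quota = {"ml": 8, "platform": 2}  # hypothetical per-tenant GPU quotas

admitted, waiting = admit(pending, quota)
print("admitted:", admitted)
print("waiting:", waiting)
```

In this sketch the ml tenant's highest-priority job is admitted first, train-small waits because it would exceed the tenant's quota of 8 GPUs, and the platform tenant's CI job is unaffected by GPU contention. Real admission involves considerably more than this, so treat the sketch as a mental model only.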

With Red Hat Build of Kueue, these upstream innovations are packaged, hardened, and delivered into Red Hat OpenShift as a supported, enterprise-ready solution, so clusters can run efficiently while every workload gets a fair chance.

Topology-aware scheduling

Once workloads are queued fairly, the next challenge is where they actually land. For distributed jobs, placement can matter as much as allocation: pods that constantly exchange data perform very differently depending on whether they're co-located or scattered across zones.

This is where topology-aware scheduling (TAS) comes in. Rather than treating the cluster as a flat pool of machines, TAS considers the physical and logical layout of the infrastructure (racks, blocks, zones) and makes scheduling decisions that optimize communication and efficiency. Workloads that talk a lot can be placed closer together, multi-pod jobs can start in sync through gang scheduling, and fairness across tenants is preserved even as locality is optimized.
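The placement idea can be sketched in a few lines of Python. This is an illustrative toy under stated assumptions, not the TAS implementation: the node inventory, the two-level block/rack hierarchy, the best-fit tie-breaking, and the place_gang function are all hypothetical. The sketch looks for the narrowest topology domain that can hold every pod of a job, and places either all pods or none (gang semantics).

```python
from collections import defaultdict

# Hypothetical inventory: node name -> (block, rack, free pod slots)
NODES = {
    "node-a": ("block-1", "rack-1", 2),
    "node-b": ("block-1", "rack-1", 1),
    "node-c": ("block-1", "rack-2", 4),
    "node-d": ("block-2", "rack-3", 3),
    "node-e": ("block-2", "rack-3", 3),
}


def place_gang(nodes, pods_needed):
    """Return {node: pod_count} within the narrowest domain that fits, else None."""
    # Try racks first (tightest locality), then blocks, then the whole cluster.
    levels = [
        lambda block, rack: (block, rack),
        lambda block, rack: (block,),
        lambda block, rack: (),
    ]
    for key_of in levels:
        domains = defaultdict(list)
        for name, (block, rack, free) in nodes.items():
            domains[key_of(block, rack)].append((name, free))
        fitting = [
            (sum(free for _, free in members), key, members)
            for key, members in domains.items()
            if sum(free for _, free in members) >= pods_needed
        ]
        if not fitting:
            continue  # nothing at this level fits; widen the domain
        # Best fit: pick the domain with the least total free capacity.
        _, _, members = min(fitting, key=lambda item: (item[0], item[1]))
        assignment, remaining = {}, pods_needed
        for name, free in sorted(members, key=lambda m: -m[1]):
            take = min(free, remaining)
            if take:
                assignment[name] = take
                remaining -= take
        return assignment
    return None  # gang semantics: place every pod or none at all


print(place_gang(NODES, 4))   # fits inside a single rack
print(place_gang(NODES, 7))   # no rack fits, so the search widens to a block
print(place_gang(NODES, 14))  # exceeds total capacity, nothing is placed
```

A four-pod job lands on a single rack, a seven-pod job spills to the block level because no rack has enough free slots, and a job larger than the cluster is not placed at all rather than being started partially.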

In practice, this means model-training jobs finish sooner, build pipelines run with fewer bottlenecks, and overall cluster utilization improves. It's scheduling with awareness, and placement itself becomes a performance feature. Today TAS is in alpha, but it's already pointing to a future where Red Hat OpenShift clusters squeeze more efficiency out of the same hardware simply by being smarter about where workloads run.

Kueue with dynamic resource allocation

Placement is only one side of the story. Increasingly, workloads also depend on specialized hardware, such as graphics processing units (GPUs), field-programmable gate arrays (FPGAs), smart network interface cards (SmartNICs), and even GPU slices like NVIDIA Multi-Instance GPU (MIG) partitions. These resources don't behave like generic CPUs. They need precise allocation, careful sharing, and guardrails to prevent monopolization.

Kueue's integration with the Kubernetes dynamic resource allocation (DRA) API addresses this gap. DRA creates a consistent framework for workloads to request and bind to devices, while Kueue extends its fairness and queueing model to manage those requests. Instead of static assignments or manual juggling, specialized hardware can now be orchestrated as cleanly as CPUs or memory.
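As a rough mental model of "request and bind" combined with fairness guardrails, consider the following Python sketch. It does not use the DRA API or Kueue's integration with it; the device names, device classes, per-tenant cap, and allocate function are hypothetical and exist only to illustrate the behavior described above.

```python
def allocate(devices, requests, max_per_tenant):
    """Bind free devices to requests in order, capping each tenant's total.

    devices:  {device_name: device_class}
    requests: list of (tenant, device_class, count)
    Returns (bindings, rejected): bindings maps a request index to the device
    names bound to it; rejected lists the request indexes left waiting.
    """
    free = dict(devices)
    held = {}  # tenant -> number of devices currently bound
    bindings, rejected = {}, []
    for idx, (tenant, device_class, count) in enumerate(requests):
        candidates = sorted(n for n, c in free.items() if c == device_class)
        over_cap = held.get(tenant, 0) + count > max_per_tenant
        if over_cap or len(candidates) < count:
            rejected.append(idx)  # all-or-nothing: no partial binding
            continue
        chosen = candidates[:count]
        for name in chosen:
            del free[name]
        held[tenant] = held.get(tenant, 0) + count
        bindings[idx] = chosen
    return bindings, rejected


devices = {
    "gpu-0": "full-gpu", "gpu-1": "full-gpu",
    "mig-0": "mig-slice", "mig-1": "mig-slice", "mig-2": "mig-slice",
}
requests = [
    ("team-a", "full-gpu", 2),
    ("team-a", "mig-slice", 2),  # would push team-a past the cap of 3
    ("team-b", "mig-slice", 2),
    ("team-b", "full-gpu", 1),   # no full GPUs remain
]

bindings, rejected = allocate(devices, requests, max_per_tenant=3)
print("bindings:", bindings)
print("rejected:", rejected)
```

Here team-a receives both full GPUs but cannot also take MIG slices beyond its cap, which leaves those slices available to team-b. The design choice worth noting is that a request is either fully bound or left waiting, so scarce devices are never stranded in half-satisfied requests.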

For Red Hat OpenShift users, this means accelerators stop being exceptions and become core resources in the scheduling process. Tenants can trust that their jobs are able to get the devices they request, no one team can monopolize scarce hardware, and administrators gain predictability across the cluster. Like TAS, DRA support in Kueue is alpha and feature-gated.

Why it matters

Features like topology-aware scheduling (TAS) and dynamic resource allocation (DRA) are still in their early stages, but that's precisely what makes them interesting. These upstream efforts show where scheduling in Kubernetes, and by extension Red Hat OpenShift, is headed: clusters that are aware of topology, hardware, and fairness across tenants.

As these capabilities mature upstream, they will continue to flow downstream through Red Hat Build of Kueue, tested, hardened, and integrated into OpenShift's enterprise-grade ecosystem. By keeping pace with Kueue's development today, OpenShift users can stay ahead of the curve and prepare for the next generation of intelligent workload management.

Next steps

To explore how these capabilities evolve downstream in Red Hat Build of Kueue, check out the following resources:

About the authors

Pannaga Rao Bhoja Ramamanohara

Software Engineer

Pannaga Rao is an engineer who loves working at the intersection of cloud, open source, and automation. When she’s not experimenting with Kubernetes operators or debugging controllers, she’s exploring new ways to make developer workflows simpler and smarter.


Sohan Kunkerkar

Senior Software Engineer

Sohan Kunkerkar is a Senior Software Engineer at Red Hat, currently working on Kueue and container runtimes. He contributes to CRI-O and the Kubernetes SIG-Node community, and has previously worked on KubeFed and Fedora CoreOS. He enjoys building reliable cloud-native platforms and sharing insights with the community.

