Red Hat Performance and Scale Engineering

Analyses from Red Hat Performance and Scale Engineering validate scalability, expose bottlenecks, and optimize Kubernetes, AI, and hybrid cloud workloads.

Red Hat AI

Benchmarking AI inference on CPUs: A transparent blueprint for the enterprise

May 28, 2026 | Maryam Tahhan, John Harrigan, Anton Ivanov, Paul Power, Luigi Mario Zuccarelli

As enterprises look to optimize the total cost of ownership (TCO) of large language model (LLM) deployments, using existing enterprise CPU infrastructure alongside GPU resources for specific inference workloads has become a strategic initiative. However, infrastructure teams attempting to validate this approach face a chaotic benchmarking landscape.

LogAn: Large-scale log analysis with small language models

Performance improvements with speculative decoding in vLLM for gpt-oss

Red Hat and NVIDIA: Setting standards for high-performance AI inference

Red Hat AI tops MLPerf Inference v6.0 with vLLM on Qwen3-VL, Whisper, and GPT-OSS-120B

Configure NVIDIA Blackwell GPUs for Red Hat AI workloads

5 steps to triage vLLM performance

Estimate GPU memory for LLM fine-tuning with Red Hat AI

How to deploy and benchmark vLLM with GuideLLM on Kubernetes

Integrate a custom AI service with Red Hat Ansible Lightspeed

Autoscaling vLLM with OpenShift AI model serving: Performance validation

Deploy an LLM inference service on OpenShift AI

Efficient and reproducible LLM inference: Inside Red Hat’s MLPerf Inference v5.1 submissions

Krkn-AI: A feedback-driven approach to chaos engineering

Network performance in distributed training: Maximizing GPU utilization on OpenShift

vLLM or llama.cpp: Choosing the right LLM inference engine for your use case

How to set up KServe autoscaling for vLLM with KEDA

Ollama vs. vLLM: A deep dive into performance benchmarking

How we improved AI inference on macOS Podman containers

How to run performance and scale validation for OpenShift AI

Performance boosts in vLLM 0.8.1: Switching to the V1 engine

Supercharge your AI with OpenShift AI and Redis: Unleash speed and scalability

Unlocking the Effective Context Length: Benchmarking the Granite-3.1-8b Model

RoCE multi-node AI training on Red Hat OpenShift

Red Hat Enterprise Linux Performance Results on Intel® Xeon® 6 processors

RHEL for Real Time: CPU throttling and risks

Best Practice Configuration and Tuning for Linux and Windows VMs

A deep dive into OpenShift Container Platform 4.20 performance

How to run performance tests using benchmark-runner

High Scale Performance Testing: Virt Density

How to run I/O workloads on OpenShift Virtualization VMs

A case study in Kubelet regression in OpenShift

How Red Hat has redefined continuous performance testing

Scaling OpenShift Network Policies: Results and Takeaways

Improving performance of multiple I/O threads for OpenShift Virtualization

OpenShift LACP bonding performance expectations

Feature Introduction: Multiple IOthreads for OpenShift Virtualization

Dynamic VM CPU Workload Rebalancing with Load Aware Descheduler

Evaluating memory overcommitment in OpenShift Virtualization

Boost OpenShift database VM density with memory overcommit

Scalable Database Performance with OpenShift Virtualization, Out-of-the-Box

How to enable Ansible Lightspeed intelligent assistant

Monitoring Red Hat Ansible Automation Platform using Performance Co-Pilot

Why your database benchmarking data is probably wrong (and how I fixed mine)

Extending the Chaos: A Guide to Building Custom Scenarios for Krkn

BGP dynamic routing with Fast Data Path on RHOSO 18

Unleash controlled chaos with krknctl

Enhancing system resilience with Krkn chaos dashboard

Scaling OpenShift Network Policies: Our Journey in Developing a Robust Workload Testing Tool
