In-person event

Red Hat at NVIDIA GTC 2025

March 18-21, 2025 | San Jose, California | San Jose McEnery Convention Center

Red Hat and NVIDIA: Bringing AI to the Enterprise

March 18 - 21, 2025

Are you attending NVIDIA GTC 2025? Visit Red Hat at booth 647 to meet with our open source experts and learn how Red Hat and NVIDIA partner to deliver the latest innovations, sharing an AI vision fueled by open source, cloud-native technologies.

Speaking Sessions

Tuesday, March 18

5:20 pm - 5:35 pm

Smarter, Not Bigger: How Small Language Models With RAG and Fine-Tuning Can Deliver Better Results at Lower Cost with Red Hat AI [EXS74230]

(SJCC Hall 3 Theater (L2))

Retrieval-augmented generation (RAG) helps align Gen AI apps to customer use cases, but large models can be expensive, and RAG has limitations. We'll explore how Red Hat AI can provide a better approach with RHEL AI and OpenShift AI:

• Fine-tune smaller language models with customer datasets for better accuracy and lower costs

• Introduce LAB (Large-Scale Alignment for ChatBots), a novel approach for instruction alignment and fine-tuning

• Achieve significant cost reduction with RAG + LAB-tuned small language models

We'll demo LAB and discuss real customers who've benefited from this approach with Red Hat AI and OpenShift AI.
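As a rough illustration of the retrieval step that RAG adds in front of a (small) language model, here is a toy sketch. The document snippets, the bag-of-words "embedding," and the helper names are all hypothetical stand-ins for illustration only; the session's actual stack is RHEL AI / OpenShift AI with real embedding models, not this code.

```python
# Toy RAG retrieval sketch: rank documents by cosine similarity to a query,
# then prepend the best match as context for a small language model.
# The embed() function is a hypothetical stand-in for a real embedding model.
from collections import Counter
import math

DOCS = [
    "InstructLab uses the LAB method for large-scale alignment of chatbots.",
    "Quantization reduces inference cost for large language models.",
    "OpenShift AI supports deploying models across hybrid clouds.",
]

def embed(text):
    # Stand-in "embedding": bag-of-words term counts.
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query, docs, k=1):
    # Return the k documents most similar to the query.
    q = embed(query)
    return sorted(docs, key=lambda d: cosine(q, embed(d)), reverse=True)[:k]

context = retrieve("how does LAB alignment work", DOCS)[0]
prompt = f"Context: {context}\nQuestion: how does LAB alignment work"
```

The point of pairing retrieval with a LAB-tuned small model is that the context supplies the customer-specific knowledge, so the model itself can stay small and cheap to serve.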

Tushar Katarki,

Sr. Director of Product Management, Red Hat

Wednesday, March 19

8:00 am - 8:40 am

Low-Precision Inference in vLLM [S72114]

(SJCC 212A (L2))

vLLM, now the de facto standard for open-source LLM serving, plays a pivotal role in driving the widespread adoption of open-weight LLMs. As enterprises and the broader community seek to reduce deployment costs and maximize inference performance, optimizing model execution time becomes essential. We'll explore how vLLM achieves significant performance gains, focusing on the quantized linear layers that deliver dramatically faster inference. We'll deep dive into the implementation and performance of vLLM's quantized linear layers, explaining how they enable both compute speedup and memory compression. We'll examine the computational regimes where different quantization strategies excel and relate that to different real-world LLM inference server workloads. Finally, we'll look at some of the advanced optimizations in vLLM’s mixed-input Machete kernels, implemented via NVIDIA CUTLASS, and discuss how these innovations pave the way for high-performance LLM serving.
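To make the memory-compression side of quantization concrete, here is a minimal sketch of symmetric per-tensor int8 weight quantization, the basic idea behind quantized linear layers. This is illustrative only; vLLM's production kernels (including the CUTLASS-based Machete kernels mentioned above) are far more sophisticated.

```python
# Symmetric int8 quantization sketch: map float weights to [-128, 127]
# with a single scale factor, cutting storage from 4 bytes (fp32) to
# 1 byte per weight, at the cost of a bounded reconstruction error.

def quantize_int8(weights):
    # Scale so the largest magnitude maps to 127.
    scale = max(abs(w) for w in weights) / 127.0
    q = [max(-128, min(127, round(w / scale))) for w in weights]
    return q, scale

def dequantize(q, scale):
    return [v * scale for v in q]

w = [0.02, -1.27, 0.635, 0.5]
q, s = quantize_int8(w)
w_hat = dequantize(q, s)
# Rounding error per weight is at most scale/2.
max_err = max(abs(a - b) for a, b in zip(w, w_hat))
```

The compute speedup comes separately: int8 matrix multiplies run on faster hardware paths than fp32/fp16, which is where the kernel engineering discussed in this session matters.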

Lucas Wilkinson,

Principal HPC Engineer, Neural Magic

Tyler Michael Smith,

Technical Director, Neural Magic

Thursday, March 20

4:00 pm - 4:20 pm

Enable AI-Native Networking for Telcos with Kubernetes [S72993]

(SJCC 212B (L2))

As telcos transform their networks for the age of AI, utilizing the NVIDIA BlueField-3 (BF3) DPU engines has been challenging for developers and customers alike. Yet BF3 is a critical part of deploying and securing an accelerated compute cluster to enable this networking-for-AI infrastructure. The DOCA Platform Framework (DPF) simplifies this task, providing a framework for lifecycle management (LCM) and provisioning of both the BF3, as a platform, and the services running on it as Kubernetes containers. DPF is deployed via two network operators in the Kubernetes environment that let you deploy and service-chain NVIDIA and third-party services. With this, for the first time, independent software vendors (ISVs), operating system (OS) vendors, and developers can deploy and orchestrate services with ease, as well as onboard these tools to their environments. In this session, OS and ISV partner companies that have adopted DPF will share their experience, what they've been able to achieve, and what comes next.

Erwan Gallen,

Senior Principal Product Manager, Red Hat

Friday, March 21

10:00 am - 10:40 am

How Can OpenShift AI and NVIDIA NIM Help You Accelerate and Optimize GenAI Application Development [S71729]

(SJCC 210B (L2))

In this session, we'll explore how Red Hat OpenShift AI empowers developers to create and deliver AI-enabled applications at scale across hybrid cloud environments. By offering native support for NVIDIA NIM, OpenShift AI unlocks seamless integration with state-of-the-art NVIDIA technologies, ensuring optimized performance. Learn how you can leverage NVIDIA NIM to deploy Generative AI (GenAI) applications on OpenShift AI for maximum efficiency and flexibility. This session will demonstrate the advantages of combining these platforms, including enhanced portability, enterprise-level security, and scalable deployment across both private and public clouds.

Tomer Figenblat,

Senior Software Engineer, Red Hat

Babak Mozaffari,

Distinguished Engineer & Director, Red Hat

On Demand

Smaller Language Model with RAG and Fine-Tuning Gets Better Results and Reduces Costs With Red Hat AI [S74237]

(Virtual)

Retrieval-augmented generation (RAG) helps align Gen AI apps to customer use cases, but large models can be expensive, and RAG has limitations. We'll explore how Red Hat AI can provide a better approach with RHEL AI and OpenShift AI:

• Fine-tune smaller language models with customer datasets for better accuracy and lower costs

• Introduce LAB (Large-Scale Alignment for ChatBots), a novel approach for instruction alignment and fine-tuning

• Achieve significant cost reduction with RAG + LAB-tuned small language models

We'll demo LAB and discuss real-world customers who have benefited from this approach with Red Hat AI and OpenShift AI.

Akash Srivastava,

Manager, AI Innovation, Red Hat

Tushar Katarki,

Sr. Director of Product Management, Red Hat

Customers know that they've got to embark on this journey to apply AI to transform their business... And so, we built the LaunchPad program to give them instant access to AI servers with Red Hat OpenShift, with the MLOps tooling.

Justin Boitano

Vice President of Enterprise and Edge Computing, NVIDIA

Hybrid Cloud Ready


Together, Red Hat OpenShift, Red Hat Enterprise Linux®, and NVIDIA BlueField DPUs provide a consistent, cloud-native application platform to manage hybrid cloud, multicloud, and edge deployments with enhanced orchestration, automation, and a focus on security.

 

Test and run Red Hat OpenShift on the NVIDIA BlueField DPU

Watch an on-demand session: Accelerating Kubernetes Hybrid Clouds with BlueField DPUs and OpenShift for Ultimate Security and Efficiency

Get popcorn and swag!

Visit Red Hat booth 647 during exhibit hours to get your very own Red Hat stadium scarf, and join us in Cesar Chavez Park for popcorn on us!

Show Information

Show Location: San Jose McEnery Convention Center

Show Address:

      150 West San Carlos Street

      San Jose, CA 95113

Show Dates: March 18-21, 2025

Show Hours (PDT):

      Tuesday, March 18: 1:00 PM – 7:00 PM

      Wednesday, March 19: 12:00 PM – 7:00 PM

      Thursday, March 20: 12:00 PM – 7:00 PM

      Friday, March 21: 11:00 AM – 2:00 PM

Red Hat Booth: 647

Resources

Open Unlocks the World's Potential

At Red Hat, our commitment to open source extends beyond technology into virtually everything we do. We collaborate and share ideas, create inclusive communities, and welcome diverse perspectives from all Red Hatters, no matter their role. It's what makes us who we are. At Red Hat, opportunities are open. Join us.

Social

Follow along on Twitter