In-person event

Red Hat at NVIDIA GTC 2025

March 18-21, 2025 | San Jose, California | San Jose McEnery Convention Center

Red Hat and NVIDIA: Bringing AI to the Enterprise

March 18 - 21, 2025

Are you attending NVIDIA GTC 2025? Visit Red Hat at booth 647 to meet with our open source experts and learn how Red Hat and NVIDIA partner to deliver the latest innovations, sharing an AI vision fueled by open source, cloud-native technologies.

Speaking Sessions

Tuesday, March 18

5:20 pm - 5:35 pm

Smarter, Not Bigger: How Small Language Models With RAG and Fine-Tuning Can Deliver Better Results at Lower Cost with Red Hat AI [EXS74230]

(SJCC Hall 3 Theater (L2))

Retrieval-augmented generation (RAG) helps align Gen AI apps to customer use cases, but large models can be expensive, and RAG has limitations. We'll explore how Red Hat AI can provide a better approach with RHEL AI and OpenShift AI:

• Fine-tune smaller language models with customer datasets for better accuracy and lower costs

• Introduce LAB (Large-Scale Alignment for ChatBots), a novel approach for instruction alignment and fine-tuning

• Achieve significant cost reduction with RAG + LAB-tuned small language models

We'll demo LAB and discuss real customers who've benefited from this approach with Red Hat AI and OpenShift AI.
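As a rough illustration of the retrieval step that RAG adds in front of a (small) language model, here is a toy sketch. The document snippets, the bag-of-words "embedding," and the helper names are all hypothetical stand-ins for illustration only; the session's actual stack is RHEL AI / OpenShift AI with real embedding models, not this code.

```python
# Toy RAG retrieval sketch: rank documents by cosine similarity to a query,
# then prepend the best match as context for a small language model.
# The embed() function is a hypothetical stand-in for a real embedding model.
from collections import Counter
import math

DOCS = [
    "InstructLab uses the LAB method for large-scale alignment of chatbots.",
    "Quantization reduces inference cost for large language models.",
    "OpenShift AI supports deploying models across hybrid clouds.",
]

def embed(text):
    # Stand-in "embedding": bag-of-words term counts.
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query, docs, k=1):
    # Return the k documents most similar to the query.
    q = embed(query)
    return sorted(docs, key=lambda d: cosine(q, embed(d)), reverse=True)[:k]

context = retrieve("how does LAB alignment work", DOCS)[0]
prompt = f"Context: {context}\nQuestion: how does LAB alignment work"
```

The point of pairing retrieval with a LAB-tuned small model is that the context supplies the customer-specific knowledge, so the model itself can stay small and cheap to serve.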

Tushar Katarki,

Sr. Director of Product Management, Red Hat

Wednesday, March 19

8:00 am - 8:40 am

Low-Precision Inference in vLLM [S72114]

(SJCC 212A (L2))

vLLM, now the de facto standard for open-source LLM serving, plays a pivotal role in driving the widespread adoption of open-weight LLMs. As enterprises and the broader community seek to reduce deployment costs and maximize inference performance, optimizing model execution time becomes essential. We'll explore how vLLM achieves significant performance gains, focusing on the quantized linear layers that deliver dramatically faster inference. We'll deep dive into the implementation and performance of vLLM's quantized linear layers, explaining how they enable both compute speedup and memory compression. We'll examine the computational regimes where different quantization strategies excel and relate that to different real-world LLM inference server workloads. Finally, we'll look at some of the advanced optimizations in vLLM’s mixed-input Machete kernels, implemented via NVIDIA CUTLASS, and discuss how these innovations pave the way for high-performance LLM serving.
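To make the memory-compression side of quantization concrete, here is a minimal sketch of symmetric per-tensor int8 weight quantization, the basic idea behind quantized linear layers. This is illustrative only; vLLM's production kernels (including the CUTLASS-based Machete kernels mentioned above) are far more sophisticated.

```python
# Symmetric int8 quantization sketch: map float weights to [-128, 127]
# with a single scale factor, cutting storage from 4 bytes (fp32) to
# 1 byte per weight, at the cost of a bounded reconstruction error.

def quantize_int8(weights):
    # Scale so the largest magnitude maps to 127.
    scale = max(abs(w) for w in weights) / 127.0
    q = [max(-128, min(127, round(w / scale))) for w in weights]
    return q, scale

def dequantize(q, scale):
    return [v * scale for v in q]

w = [0.02, -1.27, 0.635, 0.5]
q, s = quantize_int8(w)
w_hat = dequantize(q, s)
# Rounding error per weight is at most scale/2.
max_err = max(abs(a - b) for a, b in zip(w, w_hat))
```

The compute speedup comes separately: int8 matrix multiplies run on faster hardware paths than fp32/fp16, which is where the kernel engineering discussed in this session matters.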

Lucas Wilkinson,

Principal HPC Engineer, Neural Magic

Tyler Michael Smith,

Technical Director, Neural Magic

Thursday, March 20

4:00 pm - 4:20 pm

Enable AI-Native Networking for Telcos with Kubernetes [S72993]

(SJCC 212B (L2))

As telcos transform their networks for the age of AI, utilizing the NVIDIA BlueField-3 (BF3) DPU engines has been challenging for developers and customers alike. Yet BF3 is a critical part of deploying and securing an accelerated compute cluster to enable this networking-for-AI infrastructure. The DOCA Platform Framework (DPF) simplifies this task, providing a framework for lifecycle management (LCM) and provisioning of both the BF3, as a platform, and the services running on it as Kubernetes containers. DPF is deployed via two network operators in the Kubernetes environment that let you deploy and service-chain NVIDIA and third-party services. With this, for the first time, independent software vendors (ISVs), operating system (OS) vendors, and developers can deploy and orchestrate services with ease, as well as onboard these tools to their environments. In this session, OS and ISV partner companies that have adopted DPF will share their experience, what they've been able to achieve, and what comes next.

Erwan Gallen,

Senior Principal Product Manager, Red Hat

Friday, March 21

10:00 am - 10:40 am

How Can OpenShift AI and NVIDIA NIM Help You Accelerate and Optimize GenAI Application Development [S71729]

(SJCC 210B (L2))

In this session, we'll explore how Red Hat OpenShift AI empowers developers to create and deliver AI-enabled applications at scale across hybrid cloud environments. By offering native support for NVIDIA NIM, OpenShift AI unlocks seamless integration with state-of-the-art NVIDIA technologies, ensuring optimized performance. Learn how you can leverage NVIDIA NIM to deploy Generative AI (GenAI) applications on OpenShift AI for maximum efficiency and flexibility. This session will demonstrate the advantages of combining these platforms, including enhanced portability, enterprise-level security, and scalable deployment across both private and public clouds.

Tomer Figenblat,

Senior Software Engineer, Red Hat

Babak Mozaffari,

Distinguished Engineer & Director, Red Hat

On Demand

Smaller Language Model with RAG and Fine-Tuning Gets Better Results and Reduces Costs With Red Hat AI [S74237]

(Virtual)

Retrieval-augmented generation (RAG) helps align Gen AI apps to customer use cases, but large models can be expensive, and RAG has limitations. We'll explore how Red Hat AI can provide a better approach with RHEL AI and OpenShift AI:

• Fine-tune smaller language models with customer datasets for better accuracy and lower costs

• Introduce LAB (Large-Scale Alignment for ChatBots), a novel approach for instruction alignment and fine-tuning

• Achieve significant cost reduction with RAG + LAB-tuned small language models

We'll demo LAB and discuss real-world customers who have benefited from this approach with Red Hat AI and OpenShift AI.

Akash Srivastava,

Manager, AI Innovation, Red Hat

Tushar Katarki,

Sr. Director of Product Management, Red Hat

Customers know that they've got to embark on this journey to apply AI to transform their business... And so, we built the LaunchPad program to give them instant access to AI servers with Red Hat OpenShift, with the MLOps tooling.

Justin Boitano

Vice President of Enterprise and Edge Computing, NVIDIA

Hybrid Cloud Ready


Together, Red Hat OpenShift, Red Hat Enterprise Linux®, and NVIDIA BlueField DPUs provide a consistent, cloud-native application platform to manage hybrid cloud, multicloud, and edge deployments with enhanced orchestration, automation, and a focus on security.

 

Test and run Red Hat OpenShift on the NVIDIA BlueField DPU

Watch an on-demand session: Accelerating Kubernetes Hybrid Clouds with BlueField DPUs and OpenShift for Ultimate Security and Efficiency

Get popcorn and swag!

Visit Red Hat booth 647 during exhibit hours to get your very own Red Hat stadium scarf, and join us in Cesar Chavez Park for popcorn on us!

Show Information

Show Location: San Jose McEnery Convention Center

Show Address:

      150 West San Carlos Street

      San Jose, CA 95113

Show Dates: March 18-21, 2025

Show Hours (PDT):

      Tuesday, March 18: 1:00 PM – 7:00 PM

      Wednesday, March 19: 12:00 PM – 7:00 PM

      Thursday, March 20: 12:00 PM – 7:00 PM

      Friday, March 21: 11:00 AM – 2:00 PM

Red Hat Booth: 647

Resources

Open Unlocks the World's Potential

At Red Hat, our commitment to open source extends beyond technology into virtually everything we do. We collaborate and share ideas, create inclusive communities, and welcome diverse perspectives from all Red Hatters, no matter their role. It's what makes us who we are. At Red Hat, opportunities are open. Join us.

Social

Follow along on Twitter