Webinar

Accelerate Your Gen AI Transformation with Optimized LLM Inferencing

Jump to section

OVERVIEW

Generative AI technology demands significant computing resources, particularly GPUs. Unlike traditional technologies, resource consumption scales directly with the number of end users. As user adoption increases, the operational cost of maintaining generative AI services also rises sharply. This makes it difficult to sustain cost-efficient operations, as a growing user base inevitably drives up infrastructure expenses. For many enterprises, this becomes a major challenge in accelerating their AI transformation journey.

The Red Hat AI Inference Server helps organizations running AI-enabled applications and services reduce operational costs and maximize hardware utilization. Built on vLLM inference technology, the Red Hat AI Inference Server leverages innovations from a rapidly evolving upstream community—where Red Hat is a key contributor. It enables customers to serve optimized AI inference models efficiently on their on-premises infrastructure, backed by Red Hat’s enterprise-grade support.

The vLLM technology introduces innovative capabilities such as Paged Attention and Continuous Batching, which significantly enhance inference performance. These features deliver higher throughput, lower latency, and improved efficiency, allowing large language models (LLMs) to generate responses with a smaller computational footprint.

You will learn about how to integrate your own Generative AI application with Red Hat AI Inference Server with ease.

The following topics will be covered in this webinar:

  • What are the key challenges in scaling AI applications?
  • How can these challenges be addressed to meet large-scale demand?
  • What are the core innovations behind vLLM technology?
  • How does the Red Hat AI Inference Server (RHAIIS) enable efficient AI model inferencing?

Any questions? please contact Sylvia A


Kyung Huh

Kyung Huh

Senior Technical Account Manager, Red Hat

Kyung Huh is a seasoned IT professional with over 25 years of experience in the open-source software industry. His background includes extensive work as a Red Hat Certified Instructor, delivering a wide range of Red Hat training courses. He has also served as a consultant, supporting customers in Red Hat Enterprise Linux operations, performance optimization, and virtualization infrastructure design. In his current role as a Technical Account Manager, Kyung draws on his deep expertise to help customers maintain seamless IT operations and successfully adopt emerging technologies with confidence and efficiency