Get started with AI Inference

This e-book introduces the fundamentals of inference performance engineering and model optimization. It focuses on quantization, sparsity, and other techniques that reduce the compute and memory requirements of inference. It also outlines the advantages of Red Hat's open approach, its validated model repository, and tools such as LLM Compressor for large language models (LLMs) and Red Hat® AI Inference.
