Generative AI on Kubernetes: Operationalizing large language models
Generative AI (gen AI) is revolutionizing industries, and Kubernetes has quickly become the backbone for deploying and managing these resource-intensive workloads. This O'Reilly e-book is a practical guide for MLOps engineers, developers, Kubernetes administrators, and AI professionals looking to combine AI innovation with cloud-native infrastructure.
Authors Roland Huß and Daniele Zonca provide a clear roadmap for training, fine-tuning, deploying, and scaling gen AI models on Kubernetes while addressing challenges like resource optimization, automation, and security. Through real-world examples and actionable insights, readers will learn how to manage gen AI applications in production and operationalize AI workloads at scale.