Webinar

One Model, Many Teams: Optimizing Mistral Delivery via OpenShift AI

This session breaks down the journey of moving from fragmented deployments to a scalable, centralized AI service using Red Hat OpenShift AI (RHOAI). We will go behind the scenes of a real-world PoC to show how to deliver high-performance models efficiently across multiple departments, balancing developer speed with enterprise-grade governance.

Key Takeaways:

  • The Model Decision Framework: How to identify and validate general-purpose models (e.g., Mistral 24B vs. 7B) using data-driven benchmarks to meet diverse business needs.
  • Optimizing Infrastructure: Leveraging GPUs and vLLM to maximize performance through techniques like quantization and memory allocation tuning.
  • Resource Sharing at Scale: Best practices for configuring OpenShift namespaces, RBAC, and quotas to allow multiple teams to share GPU resources.
  • Enterprise Integration: Strategies for exposing optimized models via secure APIs.
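The resource-sharing pattern above can be sketched with the `oc` CLI. This is a minimal illustration, not the session's exact setup: the namespace and group names are hypothetical, and it assumes a cluster where GPUs are exposed as the `nvidia.com/gpu` extended resource (e.g. via the NVIDIA GPU Operator).

```shell
# Hypothetical team namespace for model serving.
oc new-project team-a-inference

# Cap the team's GPU consumption; extended resources are quota'd
# via the requests.<resource> form.
oc create quota gpu-quota \
  --hard=requests.nvidia.com/gpu=2 \
  -n team-a-inference

# Grant the team's group (hypothetical name) edit rights,
# scoped to their namespace only.
oc adm policy add-role-to-group edit team-a-devs -n team-a-inference
```

Repeating this per team gives each department an isolated namespace with an enforced GPU budget, while the shared model service itself can run in a central namespace that teams reach over its API.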

Join us to learn how to transform your AI strategy from a series of experiments into a highly available service that teams across your organization can leverage.

Speaker


Anne Faulhaber

Technical Account Manager, Red Hat

Explore more events and connect with our TAM