With our previous release of Red Hat OpenShift AI, we established a solid foundation for your enterprise AI infrastructure. Today, with the release of OpenShift AI 3.3, we are tackling the opposing forces that often keep AI projects from reaching production: the need for rigorous governance versus the demand for rapid developer access.
OpenShift AI 3.3 introduces a suite of tools designed to manage a centralized hub of AI assets while optimizing for the multimodel, multiagent future.
Centralized assets: The AI hub
As enterprises move beyond single-model use cases, discoverability becomes a bottleneck. Platform teams need a central source of truth for their AI assets—to register and version models before they are configured for deployment, and to view deployed models.
They also need guidance on how best to deploy these models; it is hard to assess hardware requirements and to anticipate the latency and throughput a given configuration will deliver.
The AI hub aims to provide that. It is now the central repository for your organization's AI assets, starting with large language models (LLMs) in OpenShift AI 3.3 and expanding to Model Context Protocol (MCP) servers in future releases.
In OpenShift AI 3.3, AI hub provides performance insights and guidance from our Red Hat AI model validation program on the trade-offs of performance, cost, and hardware requirements. This helps platform teams steer developers toward the most efficient configurations before deployment begins.
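For platform teams that want to automate the registration step, here is a minimal sketch using the Python client of the upstream Kubeflow Model Registry project, the basis for the model registry in OpenShift AI. The server address, credentials, and model details are placeholders, and the exact client surface may vary by version.

```python
# Minimal sketch: registering a model version programmatically with the
# upstream Kubeflow Model Registry client (pip install model-registry).
# Server address, author, and model details are placeholders.
from model_registry import ModelRegistry

registry = ModelRegistry(
    server_address="https://registry.example.com",  # hypothetical registry route
    port=443,
    author="platform-team",  # a token may also be required, depending on auth setup
)

model = registry.register_model(
    name="granite-3.1-8b-instruct",
    uri="oci://quay.io/example/granite-3.1-8b-instruct:1.0",  # artifact location
    version="1.0",
    model_format_name="safetensors",
    model_format_version="1",
    description="Validated for chat workloads; benchmarks in AI hub",
)
print(f"Registered: {model.name}")
```

Once registered, the version appears in the hub alongside the validation insights, so consumers discover one vetted entry instead of hunting for artifacts.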
Governance at scale: Model-as-a-Service (MaaS)
Building AI applications is tough if you first have to configure and manage your own GPUs and deploy AI models on them. Most developers, AI engineers, and data scientists would rather start with an endpoint for a model that's already up and running. Asking them to do all of this infrastructure work slows them down, reduces time to value, and is neither scalable nor efficient in terms of cost, time, or governance.
On the flip side, when platform teams can deliver these models to everyone, equipping their data scientists and business teams with the models they need, they extend the same paradigm they've been using for application platforms.
In this scenario, platform teams handle model serving and optimization: they provide a centralized set of AI models, control access through role-based policies, set usage limits and terms, and handle model versioning, while end users simply get an API endpoint and start happily building away.
OpenShift AI 3.3 brings a technical preview of MaaS designed to help organizations become their own internal AI model providers.
- For administrators: Define granular rate-limiting policies in the UI. For example, you can assign high-quota access for smaller models used in daily tasks while placing stricter limits on resource-intensive frontier models (see the client-side sketch after this list).
- Optimized routing with llm-d: This works in tandem with llm-d, the Kubernetes-native distributed inference framework. While you set the policies, llm-d optimizes request routing so your hardware is used as efficiently as possible without breaching service level agreements (SLAs).
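Since policies are defined in the UI, the code-facing side of MaaS is what consumers experience when they hit a limit. Below is a minimal client-side sketch, assuming the gateway exposes an OpenAI-compatible chat endpoint and returns standard HTTP 429 responses when a quota is exceeded; the endpoint, key, and model name are hypothetical.

```python
# Minimal sketch: calling a MaaS endpoint and backing off when rate limited.
# Assumes an OpenAI-compatible API and standard HTTP 429 quota responses;
# the endpoint URL, API key, and model name are placeholders.
import time
import requests

ENDPOINT = "https://maas.example.com/v1/chat/completions"  # hypothetical route
API_KEY = "sk-example"  # issued by the platform team

def chat(prompt: str, retries: int = 3) -> str:
    payload = {
        "model": "granite-3.1-8b-instruct",
        "messages": [{"role": "user", "content": prompt}],
    }
    for attempt in range(retries):
        resp = requests.post(
            ENDPOINT,
            json=payload,
            headers={"Authorization": f"Bearer {API_KEY}"},
            timeout=60,
        )
        if resp.status_code == 429:
            # Honor the gateway's Retry-After header if present,
            # otherwise back off exponentially.
            time.sleep(float(resp.headers.get("Retry-After", 2 ** attempt)))
            continue
        resp.raise_for_status()
        return resp.json()["choices"][0]["message"]["content"]
    raise RuntimeError("Still rate limited after retries")

print(chat("Summarize our usage policy in one sentence."))
```

The point is that governance stays invisible to well-behaved clients: the same endpoint works whether a request lands on one replica or is routed across a fleet by llm-d.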
Developer velocity: Gen AI studio
Models or assets deployed by platform teams need to be registered and surfaced centrally so AI engineers and developers can start building with them.
Developers also need a central place to experiment with these models and assets: a plug-and-play environment where they can find which model, prompt, or tool works best for their use case, while the complexity of the infrastructure needed to deploy them is abstracted away.
Our technical preview release of gen AI studio provides this playground and the tools developers need to move from a prompt to a pilot.
- AI playground: Experiment with prompts, model parameters, and MCP tools. In OpenShift AI 3.3, you can import your own MCP servers and toggle specific tools on or off, providing the determinism required for reliable agentic behavior. To move from the OpenShift AI UI to your local environment, the “View Code” function in OpenShift AI 3.3 lets you view and copy the playground configuration. Our upcoming roadmap builds on these foundations to enhance the AI engineer experience with code export, prompt management, retrieval-augmented generation (RAG) capabilities, and refined MCP tool selection.
- AI asset endpoints: These enable you to retrieve API keys and endpoints instantly so you can start testing in your local IDE, as the sketch after this list shows.
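Assuming the model is served through an OpenAI-compatible runtime such as vLLM, the endpoint and key retrieved from the asset view drop straight into existing tooling. Here is a minimal sketch with the openai Python client; the base URL, API key, and model name are placeholders.

```python
# Minimal sketch: pointing the openai client at an endpoint retrieved from
# the AI asset view. Assumes an OpenAI-compatible serving runtime such as
# vLLM; the base URL, API key, and model name are placeholders.
from openai import OpenAI

client = OpenAI(
    base_url="https://my-model.apps.example.com/v1",  # from AI asset endpoints
    api_key="sk-example",                             # key retrieved alongside it
)

response = client.chat.completions.create(
    model="granite-3.1-8b-instruct",
    messages=[{"role": "user", "content": "Draft a one-line release note."}],
    temperature=0.2,  # parameters you settled on in the playground
)
print(response.choices[0].message.content)
```

This mirrors the kind of configuration the “View Code” function lets you copy out of the playground, so moving from playground to IDE is closer to a paste than a rewrite.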
The production gap: Continuous evaluation and optimization
One of the largest barriers to deploying to production isn't building the model; it's managing costs and making sure quality doesn't drift.
- Cost optimization through model compression: OpenShift AI 3.3 introduces guided workbenches for LLM Compressor (GitHub) and GuideLLM (GitHub), open source tools led and used by Red Hat to benchmark and compress models as part of our model validation program. You can now benchmark a model, compress it (e.g., via quantization), and compare the performance gains directly within your environment. See more about the value of compressed models in this LLM Compressor blog post.
- Experiment tracking with MLflow: We are introducing a developer preview of MLflow integration. While compression and benchmarking help solve immediate performance issues, MLflow provides the "historical memory" for your AI lifecycle. By logging your GuideLLM results and application responses into MLflow, you track regressions and quality over time, so you can make sure your optimizations don't compromise accuracy.
- Visualizing the loop: You can now see the direct correlation between your compression experiments and inference latency within the MLflow dashboard, making performance troubleshooting data-driven rather than anecdotal. A sketch tying compression and tracking together follows this list.
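To make the loop concrete, here is a minimal sketch under stated assumptions: it quantizes a small model with LLM Compressor's one-shot API (following the project's published quickstart; details may vary by version) and records the run in MLflow. The model choice is arbitrary, and the benchmark metrics logged at the end are illustrative stand-ins for real GuideLLM output.

```python
# Minimal sketch: compress a model with LLM Compressor, then record the run
# in MLflow so later benchmarks can be compared against it. The model choice
# and the metric values are illustrative placeholders.
import mlflow
from llmcompressor import oneshot
from llmcompressor.modifiers.quantization import GPTQModifier

MODEL_ID = "TinyLlama/TinyLlama-1.1B-Chat-v1.0"  # placeholder model
OUTPUT_DIR = "tinyllama-w4a16"

mlflow.set_experiment("compression-runs")
with mlflow.start_run(run_name="gptq-w4a16"):
    mlflow.log_params({"model": MODEL_ID, "scheme": "W4A16"})

    # One-shot post-training quantization, per the LLM Compressor quickstart.
    oneshot(
        model=MODEL_ID,
        dataset="open_platypus",  # calibration data
        recipe=GPTQModifier(targets="Linear", scheme="W4A16", ignore=["lm_head"]),
        output_dir=OUTPUT_DIR,
        max_seq_length=2048,
        num_calibration_samples=512,
    )

    # After benchmarking the compressed model (e.g., GuideLLM against a vLLM
    # endpoint), log the results; the values here are placeholders.
    mlflow.log_metrics({"p95_latency_ms": 412.0, "tokens_per_second": 987.0})
```

Each run then shows up in the MLflow dashboard next to earlier ones, which is what turns "did quantization hurt quality?" from a guess into a query.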
Try Red Hat OpenShift AI
The features in OpenShift AI 3.3 are designed to transform how you govern access to AI capabilities on the platform. You can experience AI hub and preview gen AI studio and our new optimization workbenches by installing OpenShift AI 3.3. See our press release for more information.
You can also try OpenShift AI in the Red Hat product trial center. This gives you 60-day no-cost access to a fully managed environment where you can test these production-grade tools.
About the authors
Jenny is a Technical Product Manager at Red Hat AI, where she focuses on the end-to-end platform experience for Red Hat AI Enterprise. She joined Red Hat through the Neural Magic acquisition, where she created user interfaces for LLM benchmarking and an AI control plane. Before moving into AI, she consulted for healthcare organizations and public health agencies, experiences that shape her focus on building AI tooling that supports practitioners in high-stakes, deeply specialized domains.
Jehlum is a Product Manager in the Red Hat AI team. She's focused on building platforms for generative AI applications and is especially interested in data processing, observability, safety, and evaluation, all key components for building production-grade generative AI applications on platforms that scale.
Taylor specializes in helping global enterprises transition Generative AI from experimental pilots to production-scale deployments. A specialist in large-scale inference and agentic systems, Taylor bridges the gap between complex infrastructure and practical application development. She is a dedicated advocate for open-source ecosystems, leveraging projects such as vLLM, llm-d and MLflow to build sovereign, secure, and observable AI stacks. Her work is centered on empowering organizations to reclaim control over their AI lifecycle through transparent and scalable open-source solutions.