With our previous release of Red Hat OpenShift AI, we established a solid foundation for your enterprise AI infrastructure. Today, with the release of OpenShift AI 3.3, we are tackling a tension that often keeps AI projects from reaching production: the need for rigorous governance versus the demand for rapid developer access.

OpenShift AI 3.3 introduces a suite of tools designed to manage a centralized hub of AI assets while optimizing for the multimodel, multiagent future.

Centralized assets: The AI hub

As enterprises move beyond single-model use cases, discoverability becomes a bottleneck. Platform teams need a central source of truth for their AI assets—to register and version models before they are configured for deployment, and to view deployed models. 

They also need guidance on how best to deploy these models, because assessing hardware requirements and predicting the latency and throughput to expect is hard.

The AI hub aims to provide exactly that: a central repository for your organization's AI assets, starting with large language models (LLMs) in OpenShift AI 3.3 and expanding to Model Context Protocol (MCP) servers in future releases.

In OpenShift AI 3.3, AI hub provides performance insights and guidance from our Red Hat AI model validation program on the trade-offs of performance, cost, and hardware requirements. This helps platform teams steer developers toward the most efficient configurations before deployment begins.

Governance at scale: Model-as-a-Service (MaaS)

If you're configuring and managing your own GPUs and deploying AI models on them, building AI applications is tough. Most developers, AI engineers, and data scientists would rather start with an endpoint for a model that’s already up and running. Asking them to do all of this extra work slows them down, reduces time to value, and is neither scalable nor efficient, whether in terms of cost, time, or governance. 

On the flip side, when platform teams can deliver these models to everyone and equip their data scientists and business teams with the models they need, they extend the same paradigm they've been using for application platforms.

In this scenario, platform teams handle model serving and optimization: they provide a centralized set of AI models, control access through role-based policies, set usage limits and terms, and manage model versioning, while end users simply receive an API endpoint and start happily building away.

OpenShift AI 3.3 brings a technical preview of MaaS designed to help organizations become their own internal AI model providers.

  • For administrators: Define granular rate-limiting policies in the UI. For example, you can assign high quota access for smaller models used in daily tasks while placing stricter limits on resource-intensive frontier models.
  • Optimized routing with llm-d: This works in tandem with llm-d, the Kubernetes-native distributed inference framework. While you set the policies, llm-d optimizes request routing so your hardware is used as efficiently as possible without breaching service level agreements (SLAs).
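On the client side, those rate limits surface as rejected requests. As an illustration only, here is a minimal Python sketch of retrying a MaaS call with exponential backoff when the gateway signals that a quota has been hit; the exception type, delays, and retry counts are assumptions for illustration, not part of the product API:

```python
import time


class RateLimited(Exception):
    """Raised when the MaaS gateway rejects a request (e.g., HTTP 429)."""


def call_with_backoff(request_fn, max_retries=4, base_delay=0.5, sleep=time.sleep):
    """Call a MaaS endpoint, backing off exponentially on rate limiting.

    `request_fn` is any callable that performs one request and raises
    RateLimited when the team's quota is exceeded. Delays double on each
    retry: 0.5s, 1s, 2s, ... The last failure is re-raised to the caller.
    """
    for attempt in range(max_retries + 1):
        try:
            return request_fn()
        except RateLimited:
            if attempt == max_retries:
                raise
            sleep(base_delay * (2 ** attempt))
```

In practice you would map an HTTP 429 response from the endpoint to `RateLimited` and honor any `Retry-After` header the gateway returns instead of a fixed schedule.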

Developer velocity: Gen AI studio

Models or assets deployed by platform teams need to be registered and surfaced centrally so AI engineers and developers can start building with them.

Developers also need a central place to experiment with these models and assets: a plug-and-play environment where they can find which model, prompt, or tool works best for their use case, while the complexity of the underlying deployment infrastructure stays abstracted away.

Our technical preview release of gen AI studio provides this playground and the tools developers need to move from a prompt to a pilot.

  • AI playground: Experiment with prompts, model parameters, and MCP tools. In OpenShift AI 3.3, you can import your own MCP servers and toggle specific tools on or off, providing the determinism required for reliable agentic behavior. To move from the OpenShift AI UI to your local environment, the “View Code” function lets you view and copy the playground configuration. Our roadmap builds on these foundations to enhance the AI engineer experience with code export, prompt management, retrieval-augmented generation (RAG) capabilities, and refined MCP tool selection.
  • AI asset endpoints: These enable you to retrieve API keys and endpoints instantly so you can start testing in your local IDE.
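Once you have an endpoint and key, a typical next step is calling the model through an OpenAI-compatible API from your IDE. The sketch below assembles such a request with the Python standard library; the `/v1/chat/completions` path, `Bearer` auth header, endpoint URL, and model name are assumptions based on the common OpenAI-style convention, so check the values your own deployment exposes:

```python
import json
from urllib import request


def build_chat_request(endpoint: str, api_key: str, model: str, prompt: str):
    """Assemble an OpenAI-compatible chat completion request for a model
    endpoint retrieved from the AI asset endpoints view.

    Returns a urllib Request; send it with urllib.request.urlopen(req).
    """
    body = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }).encode("utf-8")
    return request.Request(
        url=endpoint.rstrip("/") + "/v1/chat/completions",
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

For production code you would typically use an OpenAI-compatible client library instead and read the key from a secret store rather than hard-coding it.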

The production gap: Continuous evaluation and optimization

One of the largest barriers to deploying to production isn't building the model; it's managing costs and making sure quality doesn't drift.

  • Cost optimization through model compression: OpenShift AI 3.3 introduces guided workbenches for LLM Compressor (GitHub) and GuideLLM (GitHub), open source tools led and used by Red Hat to benchmark and compress models as part of our model validation program. You can now benchmark a model, compress it (e.g., via quantization), and compare the performance gains directly within your environment. See more about the value of compressed models in this LLM Compressor blog post.
  • Experiment tracking with MLflow: We are introducing a developer preview of MLflow integration. While compression and benchmarking help solve immediate performance issues, MLflow provides the "historical memory" for your AI lifecycle. By logging your GuideLLM results and application responses into MLflow, you can track quality and regressions over time and make sure your optimizations don't compromise accuracy.
  • Visualizing the loop: You can now see the direct correlation between your compression experiments and inference latency within the MLflow dashboard, making performance troubleshooting data-driven rather than anecdotal.
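The before/after check this loop enables can be sketched in a few lines: given benchmark metrics for a baseline model and its compressed variant, compute the speedup and verify that the accuracy drop stays within budget. The metric names and threshold below are illustrative assumptions, not a fixed OpenShift AI or MLflow schema:

```python
def evaluate_compression(baseline: dict, compressed: dict,
                         max_accuracy_drop: float = 0.01):
    """Compare benchmark results (e.g., GuideLLM runs logged to MLflow)
    for a baseline model and its compressed variant.

    Each dict holds `throughput_tps` (generated tokens per second) and
    `accuracy` (an evaluation score between 0 and 1). Returns the
    throughput speedup, the absolute accuracy drop, and whether that
    drop stays within the allowed budget.
    """
    speedup = compressed["throughput_tps"] / baseline["throughput_tps"]
    accuracy_drop = baseline["accuracy"] - compressed["accuracy"]
    return {
        "speedup": round(speedup, 2),
        "accuracy_drop": round(accuracy_drop, 4),
        "within_budget": accuracy_drop <= max_accuracy_drop,
    }
```

Logging these derived values as MLflow metrics alongside the raw runs is what makes regressions visible across experiments rather than within a single run.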

Try Red Hat OpenShift AI 

The features in OpenShift AI 3.3 are designed to transform how you govern access to AI capabilities on the platform. You can experience AI hub, and preview gen AI studio and our new optimization workbenches by installing OpenShift AI 3.3. See our press release for more information.

You can also try OpenShift AI in the Red Hat product trial center. This gives you 60-day no-cost access to a fully managed environment where you can test these production-grade tools.


About the authors

Jenny is a Technical Product Manager at Red Hat AI, where she focuses on the end-to-end platform experience for Red Hat AI Enterprise. She joined Red Hat through the Neural Magic acquisition, where she created user interfaces for LLM benchmarking and an AI control plane. Before moving into AI, she consulted for healthcare organizations and public health agencies, experiences that shape her focus on building AI tooling that supports practitioners in high-stakes, deeply specialized domains.

Jehlum is a Product Manager on the Red Hat AI team, focused on building platforms for generative AI applications. She is especially interested in data processing, observability, safety, and evaluation, all key components for building production-grade generative AI applications on platforms that scale.
