You've picked a model. Maybe it's a 70-billion-parameter model because someone on the team saw it top a leaderboard. Now you need it running in production on your Red Hat OpenShift AI cluster. So you start tuning batch sizes, figuring out quantization, sizing GPU requests, writing Kubernetes manifests, and hoping the out-of-memory errors stop before your deadline hits.
We've watched this play out enough times to see the pattern. The hard part of enterprise AI isn't just picking a model; it's the stretch between "this model looks good" and "this model is serving traffic reliably." That stretch eats weeks, sometimes months. Project Navigator is part of our strategy to shorten it.
The operational bottleneck
The AI industry has a model problem, but probably not the one you're thinking of. There are thousands of AI models available today. The tooling to train and fine-tune them keeps getting better. But getting a model deployed well, on hardware that fits, with a configuration that doesn't waste money? Typically, that still involves a lot of manual work.
GPU utilization at most organizations is well below what the hardware can do. Cost overruns on AI infrastructure are common. The people who know how to size and tune these systems are hard to find.
The pain shows up differently depending on your role. If you're an AI engineer, you spend weeks benchmarking models and debugging memory errors before a single request gets served. If you're a platform architect, you're watching expensive GPUs sit half-idle because nobody sized the workloads properly. And if you're the product owner, you can't tell whether a deployment will be cost-effective until it's already running and the invoices start arriving.
What is Project Navigator?
Project Navigator is a layer on top of OpenShift AI that connects to your cluster, sees what's running and what's available, and helps you make better choices about model selection and deployment. You talk to it in plain language. Tell it what you're building ("I want a retrieval-augmented generation app for our internal knowledge base for 20 concurrent users with a max of 1.5s latency"), and it works with your cluster's actual state: the models in your catalog, the hardware you have, and the benchmarks that matter for your use case.
Project Navigator is currently available in OpenShift AI 3.4 as a developer preview with two capabilities.
1. Intelligent model selection
Teams routinely pick models based on name recognition or leaderboard rankings without checking whether a smaller model can handle their task just as well or better. Navigator changes this by matching what you describe against a collection of benchmark data (MMLU, HumanEval, and others) and recommending the best fit from your own model catalog.
The results can be surprising. For a code-heavy workload, an 8-billion-parameter model may outperform a general-purpose 70-billion-parameter model on the benchmarks that actually matter, while using a fraction of the GPU resources. Navigator puts those comparisons in front of you so you pick based on evidence, not habit. It offers cost-centric, performance-centric, and blended recommendations.
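To make the three recommendation modes concrete, here is a minimal Python sketch of one way such a ranking could work. It is not Navigator's actual algorithm: the Candidate type, the rank function, the model names, and every number in CATALOG are hypothetical placeholders invented for illustration, not real benchmark results.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    name: str
    task_score: float         # score on the benchmark that matters for the task, 0..1
    gpu_cost_per_hour: float  # estimated serving cost, arbitrary units


# Placeholder catalog: names and numbers are made up for illustration only.
CATALOG = [
    Candidate("code-model-8b", task_score=0.72, gpu_cost_per_hour=1.0),
    Candidate("general-model-70b", task_score=0.68, gpu_cost_per_hour=8.0),
    Candidate("general-model-13b", task_score=0.55, gpu_cost_per_hour=2.0),
]


def rank(candidates, quality_weight):
    """Rank candidates, best first.

    quality_weight=1.0 is performance-centric, 0.0 is cost-centric,
    and anything in between is a blended recommendation.
    """
    max_cost = max(c.gpu_cost_per_hour for c in candidates)

    def score(c):
        # Normalize cost so the cheapest options score closest to 1.
        cheapness = 1.0 - c.gpu_cost_per_hour / max_cost
        return quality_weight * c.task_score + (1.0 - quality_weight) * cheapness

    return sorted(candidates, key=score, reverse=True)


for label, weight in [("performance", 1.0), ("cost", 0.0), ("blended", 0.5)]:
    print(label, [c.name for c in rank(CATALOG, weight)])
```

With these placeholder numbers the small code-focused model ranks first in all three modes, while the runner-up changes depending on how much weight cost receives.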
2. Optimized model deployment
Once you've selected a model, Navigator generates Kubernetes manifests tailored to your cluster. That means a KServe InferenceService spec with resource requests and limits, a HorizontalPodAutoscaler with scaling rules, and a ServiceMonitor for Prometheus-based observability. All sized against what your cluster actually has, not generic defaults.
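As a rough illustration of what sizing a manifest against real hardware involves, the Python sketch below estimates how many GPUs a model's weights need and builds a KServe InferenceService manifest as a plain dictionary. It is a sketch under stated assumptions, not Navigator's output: the helper names, the 30% memory headroom, the 80 GiB GPU size, the "vLLM" model format name, the replica bounds, and the storage URI are all assumptions chosen for the example. It covers only the InferenceService; the HorizontalPodAutoscaler and ServiceMonitor are left out.

```python
import json
import math


def weight_memory_gib(params_billion, bytes_per_param=2):
    """Approximate memory for model weights only (2 bytes per parameter = 16-bit).

    Ignores KV cache and activations, which need additional memory.
    """
    return params_billion * 1e9 * bytes_per_param / 2**30


def gpus_needed(params_billion, gpu_memory_gib, headroom=0.3, bytes_per_param=2):
    """GPUs required to hold the weights, keeping `headroom` of each GPU free.

    The 30% default headroom is an assumption for this sketch.
    """
    usable_gib = gpu_memory_gib * (1.0 - headroom)
    return math.ceil(weight_memory_gib(params_billion, bytes_per_param) / usable_gib)


def inference_service(name, storage_uri, gpu_count, model_format="vLLM"):
    """Build a KServe InferenceService manifest as a dict.

    The model format name must match a ServingRuntime installed on the
    cluster; "vLLM" is an assumption here.
    """
    gpus = {"nvidia.com/gpu": str(gpu_count)}
    return {
        "apiVersion": "serving.kserve.io/v1beta1",
        "kind": "InferenceService",
        "metadata": {"name": name},
        "spec": {
            "predictor": {
                "minReplicas": 1,  # assumed scaling bounds
                "maxReplicas": 3,
                "model": {
                    "modelFormat": {"name": model_format},
                    "storageUri": storage_uri,
                    "resources": {"requests": dict(gpus), "limits": dict(gpus)},
                },
            }
        },
    }


# Hypothetical 8-billion-parameter model on GPUs with 80 GiB of memory each.
manifest = inference_service(
    name="code-model-8b",
    storage_uri="pvc://models/code-model-8b",  # hypothetical location
    gpu_count=gpus_needed(8, gpu_memory_gib=80),
)
print(json.dumps(manifest, indent=2))
```

Kubernetes accepts manifests in JSON as well as YAML, so the printed output can be saved to a file and applied after review.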
The gap between a properly configured deployment and a default one is large, both in hardware utilization and cost. Navigator closes it without requiring your team to become infrastructure specialists.
How it's built
Navigator is built on an open integration standard called the Model Context Protocol (MCP). This means it can meet teams where they work. If your engineers already use tools like Claude Code, Cursor, or Gemini CLI, they can access Navigator's capabilities directly from those environments. As more tools adopt MCP, Navigator will work with those as well, and as we bring Navigator's capabilities into Red Hat OpenShift AI, teams will have even more ways to access them.
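For readers unfamiliar with MCP, a client invokes a server-side tool by sending a JSON-RPC 2.0 message with the method tools/call. The Python sketch below builds such a message for the example request described earlier. The tool name recommend_model and its argument names are hypothetical; the tools the RHOAI MCP Server actually exposes may be named and shaped differently, and in practice an MCP client such as your editor or CLI builds these messages for you.

```python
import json
from itertools import count

_request_ids = count(1)  # JSON-RPC requests carry a unique id per request


def tools_call_request(tool_name, arguments):
    """Build an MCP `tools/call` request as a JSON-RPC 2.0 message."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


# Hypothetical tool and argument names, mirroring the plain-language example:
# a RAG app for 20 concurrent users with a maximum latency of 1.5 seconds.
request = tools_call_request(
    "recommend_model",
    {
        "use_case": "retrieval-augmented generation",
        "concurrent_users": 20,
        "max_latency_seconds": 1.5,
    },
)
print(json.dumps(request, indent=2))
```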
The project is open source, developed upstream in the Open Data Hub community under the name RHOAI MCP Server.
Get involved
Project Navigator is in developer preview. We shipped it early because we want feedback from teams dealing with these problems right now.
Grab the code and try it: RHOAI MCP Server (Project Navigator)
If you're running OpenShift AI and spending too much time on model selection or deployment tuning, we'd like to hear from you. File issues, open pull requests, or tell us what's broken. We're building this in the open for a reason.
About the authors
Suhas Kashyap is a Product Manager on the Red Hat OpenShift AI team, where he focuses on AI/ML platform capabilities including model customization, RAG, and developer tooling. He brings over 22 years of software industry experience spanning development, architecture, and DevOps.
Before joining Red Hat, Suhas spent 9.5 years at IBM in AI Product Management, where he shipped the AI Toolkit for IBM Z and LinuxONE and worked extensively on model customization and advanced RAG capabilities within watsonx.ai.
Outside of work, Suhas is an avid cricketer, half-marathon runner, amateur photographer, and self-described lawn care enthusiast.
Amit Oren is a Principal Engineer at Red Hat, where he works on OpenShift AI and serves as a maintainer of the llm-d Planner project. His work focuses on building scalable AI platforms, inference systems, and cloud-native technologies that help bring AI into production.
Before joining Red Hat, Amit worked at Amazon Web Services (AWS), where he contributed to large-scale cloud systems. Over the course of his career, he has designed and delivered distributed systems spanning cloud infrastructure, networking, data storage, and software architecture.
He holds a broad technical background in artificial intelligence, Kubernetes and cloud-native platforms, infrastructure as code, operating systems, networking protocols, and distributed systems engineering.
Paul van Run is an Architect on the Red Hat OpenShift AI team, specializing in observability and AI tooling. He brings over 30 years of software industry experience, including Director and CTO roles at startup companies and extensive work at IBM and Red Hat in software development, cloud, and AI technologies.
Outside of work, Paul enjoys playing goalkeeper in soccer, woodworking and laser cutting, furniture restoration, and hands-on home renovation projects.