You've picked a model. Maybe it's a 70-billion-parameter model because someone on the team saw it top a leaderboard. Now you need it running in production on your Red Hat OpenShift AI cluster. So you start tuning batch sizes, figuring out quantization, sizing GPU requests, writing Kubernetes manifests, and hoping the out-of-memory errors stop before your deadline hits.
We've watched this play out enough times to see the pattern. The hard part of enterprise AI isn't just picking a model; it's the stretch between "this model looks good" and "this model is serving traffic reliably." That stretch eats weeks, sometimes months. Project Navigator is part of our strategy to shorten it.
The operational bottleneck
The AI industry has a model problem, but probably not the one you're thinking of. There are thousands of AI models available today. The tooling to train and fine-tune them keeps getting better. But getting a model deployed well, on hardware that fits, with a configuration that doesn't waste money? Typically, that still involves a lot of manual work.
GPU utilization at most organizations is well below what the hardware can do. Cost overruns on AI infrastructure are common. The people who know how to size and tune these systems are hard to find.
The pain shows up differently depending on your role. If you're an AI engineer, you spend weeks benchmarking models and debugging memory errors before a single request gets served. If you're a platform architect, you're watching expensive GPUs sit half-idle because nobody sized the workloads properly. And if you're the product owner, you can't tell whether a deployment will be cost-effective until it's already running and the invoices start arriving.
What is Project Navigator?
Project Navigator is a layer on top of OpenShift AI that connects to your cluster, sees what's running and what's available, and helps you make better choices about model selection and deployment. You talk to it in plain language. Tell it what you're building ("I want a retrieval-augmented generation app for our internal knowledge base for 20 concurrent users with a max of 1.5s latency"), and it works with your cluster's actual state: the models in your catalog, the hardware you have, and the benchmarks that matter for your use case.
Project Navigator is currently available in OpenShift AI 3.4 as a developer preview with two capabilities.
1. Intelligent model selection
Teams routinely pick models based on name recognition or leaderboard rankings without checking whether a smaller model can handle their task just as well or better. Navigator changes this by matching what you describe against a collection of benchmark data (MMLU, HumanEval, and others) and recommending the best fit from your own model catalog.
The results can be surprising. For a code-heavy workload, an 8-billion-parameter model may outperform a general-purpose 70-billion-parameter model on the benchmarks that actually matter, while using a fraction of the GPU resources. Navigator puts those comparisons in front of you so you pick based on evidence, not habit. It offers cost-centric, performance-centric, and blended recommendations.
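To make the idea of cost-centric, performance-centric, and blended recommendations concrete, here is a minimal Python sketch. It is not Navigator's actual scoring logic, which this post doesn't describe. The model names, benchmark scores, GPU counts, and weights are all made up for illustration, and the scoring formula is an assumption.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str
    scores: dict          # benchmark name -> score in [0, 1]
    gpus_required: int    # GPUs needed to serve the model


# Assumed weights: how much benchmark quality counts in each mode.
# Whatever is left over goes to cost.
QUALITY_WEIGHT = {"performance": 1.0, "blended": 0.6, "cost": 0.3}


def rank(candidates, benchmark, mode="blended"):
    """Return candidates sorted best-first for one benchmark and one mode."""
    weight = QUALITY_WEIGHT[mode]
    fewest_gpus = min(c.gpus_required for c in candidates)

    def fit(c):
        quality = c.scores.get(benchmark, 0.0)
        cheapness = fewest_gpus / c.gpus_required  # 1.0 for the cheapest model
        return weight * quality + (1.0 - weight) * cheapness

    return sorted(candidates, key=fit, reverse=True)


# Made-up catalog entries and scores, for illustration only.
catalog = [
    Candidate("code-8b", {"HumanEval": 0.70, "MMLU": 0.60}, gpus_required=1),
    Candidate("general-70b", {"HumanEval": 0.60, "MMLU": 0.80}, gpus_required=4),
]

for mode in QUALITY_WEIGHT:
    best = rank(catalog, "HumanEval", mode)[0]
    print(f"{mode}: {best.name}")
```

With these invented numbers, the smaller model wins on the code benchmark in every mode, while the larger one only comes out ahead on MMLU when cost is ignored. That is the kind of trade-off a recommendation has to surface.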
2. Optimized model deployment
Once you've selected a model, Navigator generates Kubernetes manifests tailored to your cluster. That means a KServe InferenceService spec with resource requests and limits, a HorizontalPodAutoscaler with scaling rules, and a ServiceMonitor for Prometheus-based observability. All sized against what your cluster actually has, not generic defaults.
The gap between a properly configured deployment and a default one is large, both in hardware utilization and cost. Navigator closes it without requiring your team to become infrastructure specialists.
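For readers who haven't written these objects by hand, the sketch below builds the three manifests mentioned above as plain Python dictionaries. It is an illustration of their general shape, not Navigator's output. The model name, namespace, storage URI, label, metrics port name, model format, and the `-predictor` Deployment name are assumptions, and the sizing values are placeholders rather than recommendations.

```python
import json


def build_manifests(name, namespace, storage_uri, gpus, memory, max_replicas):
    """Return an InferenceService, an HPA, and a ServiceMonitor as dicts."""
    labels = {"app": name}  # hypothetical label tying the objects together
    resources = {"nvidia.com/gpu": str(gpus), "memory": memory}

    inference_service = {
        "apiVersion": "serving.kserve.io/v1beta1",
        "kind": "InferenceService",
        "metadata": {"name": name, "namespace": namespace, "labels": labels},
        "spec": {
            "predictor": {
                "model": {
                    "modelFormat": {"name": "vLLM"},  # assumed runtime format
                    "storageUri": storage_uri,
                    "resources": {"requests": resources, "limits": resources},
                }
            }
        },
    }

    autoscaler = {
        "apiVersion": "autoscaling/v2",
        "kind": "HorizontalPodAutoscaler",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "scaleTargetRef": {
                "apiVersion": "apps/v1",
                "kind": "Deployment",
                "name": f"{name}-predictor",  # assumed Deployment name
            },
            "minReplicas": 1,
            "maxReplicas": max_replicas,
            "metrics": [{
                "type": "Resource",
                "resource": {
                    "name": "cpu",
                    "target": {"type": "Utilization", "averageUtilization": 70},
                },
            }],
        },
    }

    service_monitor = {
        "apiVersion": "monitoring.coreos.com/v1",
        "kind": "ServiceMonitor",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "selector": {"matchLabels": labels},
            # Assumed port name on the Service that exposes metrics.
            "endpoints": [{"port": "metrics", "interval": "30s"}],
        },
    }
    return [inference_service, autoscaler, service_monitor]


manifests = build_manifests(
    name="code-8b", namespace="demo", storage_uri="pvc://models/code-8b",
    gpus=1, memory="24Gi", max_replicas=3,
)
print(json.dumps(manifests, indent=2))
```

The point of the sketch is that every number in it (GPU count, memory, replica bounds, scaling threshold) is a sizing decision someone has to make, which is the work Navigator aims to take on using the cluster's real state.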
How it's built
Navigator is built on an open integration standard called the Model Context Protocol (MCP). This means it can meet teams where they work. If your engineers already use tools like Claude Code, Cursor, or Gemini CLI, they can access Navigator's capabilities directly from those environments. As more tools adopt MCP, Navigator will work with those as well, and as we bring Navigator's capabilities into Red Hat OpenShift AI, teams will have even more ways to access them.
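Under the hood, MCP clients talk to servers with JSON-RPC 2.0 messages, and tools are invoked through the `tools/call` method. The sketch below builds such a request in Python for the example requirements given earlier. The tool name `recommend_model` and the argument names are hypothetical; the tools the server actually exposes should be discovered with `tools/list` or from the project's documentation.

```python
import json
from itertools import count

_request_ids = count(1)  # simple incrementing JSON-RPC request IDs


def tool_call(tool_name, arguments):
    """Build a JSON-RPC 2.0 request for the MCP `tools/call` method."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


request = tool_call(
    "recommend_model",  # hypothetical tool name, not taken from the project
    {
        # Hypothetical argument names carrying the plain-language requirements.
        "use_case": "retrieval-augmented generation",
        "concurrent_users": 20,
        "max_latency_seconds": 1.5,
    },
)
print(json.dumps(request, indent=2))
```

In practice an MCP-aware client such as the tools named above builds and sends these messages for you; the sketch only shows what travels over the wire.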
The project is open source, developed upstream in the Open Data Hub community under the name RHOAI MCP Server.
Get involved
Project Navigator is in developer preview. We shipped it early because we want feedback from teams dealing with these problems right now.
Grab the code and try it: RHOAI MCP Server (Project Navigator)
If you're running OpenShift AI and spending too much time on model selection or deployment tuning, we'd like to hear from you. File issues, open pull requests, or tell us what's broken. We're building this in the open for a reason.
About the authors
Suhas Kashyap is a Product Manager on the Red Hat OpenShift AI team, where he focuses on AI/ML platform capabilities including model customization, RAG, and developer tooling. He brings over 22 years of software industry experience spanning development, architecture, and DevOps.
Before joining Red Hat, Suhas spent 9.5 years at IBM in AI Product Management, where he shipped the AI Toolkit for IBM Z and LinuxONE and worked extensively on model customization and advanced RAG capabilities within watsonx.ai.
Outside of work, Suhas is an avid cricketer, half-marathon runner, amateur photographer, and self-described lawn care enthusiast.
Amit Oren is a Principal Engineer at Red Hat, where he works on OpenShift AI and serves as a maintainer of the llm-d Planner project. His work focuses on building scalable AI platforms, inference systems, and cloud-native technologies that help bring AI into production.
Before joining Red Hat, Amit worked at Amazon Web Services (AWS), where he contributed to large-scale cloud systems. Over the course of his career, he has designed and delivered distributed systems spanning cloud infrastructure, networking, data storage, and software architecture.
He holds a broad technical background in artificial intelligence, Kubernetes and cloud-native platforms, infrastructure as code, operating systems, networking protocols, and distributed systems engineering.
Paul van Run is an Architect on the Red Hat OpenShift AI team, specializing in observability and AI tooling. He brings over 30 years of software industry experience, including Director and CTO roles at startup companies and extensive work at IBM and Red Hat in software development, cloud, and AI technologies.
Outside of work, Paul enjoys playing goalkeeper in soccer, woodworking and laser cutting, furniture restoration, and hands-on home renovation projects.