The transition from AI experimentation to production-grade deployment is often the most difficult hurdle for an enterprise. At Red Hat, we believe that choosing a model should come with predictable outcomes, rather than uncertainty. Our third-party model validation initiative is designed to remove the guesswork, providing the guidance and predictability organizations need to scale their AI infrastructure effectively.
The January and February 2026 batches of validated models are now available on the Red Hat AI Hugging Face page, coinciding with the Red Hat AI 3.3 release. These model releases introduce frontier-class reasoning and multimodal capabilities, packaged for simple, high-performance deployment on the Red Hat AI platform.
Beyond the benchmarks: Validation as operational guidance
While public leaderboards provide a snapshot of a model's intelligence, they rarely tell you how that model will perform on specific hardware or within your production constraints. Think of our validation process like a safety rating for industrial equipment: it helps verify that the tool, or the model in our case, is powerful, reliable, and fit for its environment. Red Hat AI model validation provides precision guidance for capacity planning and reliability, rather than a generic performance guarantee.
- Established baselines: Using GuideLLM, we provide resource requirements and performance profiles across diverse hardware configurations, so you can right-size your infrastructure.
- Integrity verification: Using lm-eval-harness, we help verify that optimizations, such as FP8 and NVFP4 quantization, preserve the model's accuracy. This allows you to gain efficiency without compromising quality.
- Standardized deployment: Every model is packaged as a ModelCar, a specialized container format that treats AI models as standard OCI artifacts. This creates a reproducible, security-scanned, and version-controlled asset that is ready for high-throughput serving via vLLM or llm-d.
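To make the integrity-verification step concrete, here is a minimal sketch of an accuracy-recovery check: the quantized model's score expressed as a percentage of the unquantized baseline, per task. It assumes the per-task scores have already been extracted from evaluation results. The task names, the scores, and the 99% threshold are hypothetical placeholders, not Red Hat's published criteria.

```python
def accuracy_recovery(baseline: dict[str, float], quantized: dict[str, float]) -> dict[str, float]:
    """Return each quantized score as a percentage of its baseline score."""
    recovery = {}
    for task, base_score in baseline.items():
        if task not in quantized:
            raise KeyError(f"missing quantized score for task {task!r}")
        if base_score <= 0:
            raise ValueError(f"baseline score for {task!r} must be positive")
        recovery[task] = 100.0 * quantized[task] / base_score
    return recovery


def meets_threshold(recovery: dict[str, float], threshold_pct: float = 99.0) -> bool:
    """True when every task recovers at least threshold_pct of its baseline."""
    return all(pct >= threshold_pct for pct in recovery.values())


# Hypothetical scores for illustration only; not measured results.
baseline_scores = {"task_a": 0.80, "task_b": 0.50}
quantized_scores = {"task_a": 0.796, "task_b": 0.50}

recovery = accuracy_recovery(baseline_scores, quantized_scores)
for task, pct in recovery.items():
    print(f"{task}: {pct:.1f}% of baseline")
print("within threshold:", meets_threshold(recovery))
```

Reporting recovery per task, rather than as a single average, keeps a regression on one benchmark from being hidden by gains on another.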
Solving the “vetting bottleneck” with a secure model supply chain
We help organizations accelerate deployment by automating security-by-design. Shifting security left in the AI lifecycle helps ensure models meet enterprise standards before they reach production.
- Vulnerability scanning: ModelCars undergo a basic vulnerability scan as a core step in our containerization pipeline.
- Signing: Our models and ModelCars are signed via the build pipeline using Sigstore and the Red Hat Trusted Artifact Signer; these signatures and attestations are currently hosted in the registry to support end-to-end integrity and authenticity.
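- Tamper protection: We distribute weights in SafeTensors, the Hugging Face standard format, which stores only tensor data and metadata. Loading a SafeTensors file cannot execute arbitrary code the way a pickle-based checkpoint can, which removes the "model-as-code" threat and gives security and compliance teams the confidence to move assets into production faster.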
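The reason SafeTensors sidesteps pickle-style threats is visible in its file layout: an 8-byte little-endian length, a JSON header describing each tensor, then raw tensor bytes. The sketch below reads and sanity-checks that header using only the standard library. It illustrates the format and is not part of Red Hat's pipeline; the function name and the header size cap are assumptions made for this example.

```python
import io
import json
import struct
from typing import BinaryIO

# Assumed sanity cap for this sketch; not a value taken from the article.
MAX_HEADER_BYTES = 100 * 1024 * 1024


def read_safetensors_header(stream: BinaryIO) -> dict:
    """Parse the JSON header of a SafeTensors stream.

    Raises ValueError (or a subclass, such as json.JSONDecodeError) if the
    stream does not follow the SafeTensors layout.
    """
    prefix = stream.read(8)
    if len(prefix) != 8:
        raise ValueError("stream too short to hold the length prefix")
    (header_len,) = struct.unpack("<Q", prefix)  # unsigned 64-bit, little-endian
    if header_len == 0 or header_len > MAX_HEADER_BYTES:
        raise ValueError(f"implausible header length: {header_len}")
    raw = stream.read(header_len)
    if len(raw) != header_len:
        raise ValueError("stream ended before the declared header length")
    header = json.loads(raw.decode("utf-8"))
    if not isinstance(header, dict):
        raise ValueError("header is not a JSON object")
    for name, entry in header.items():
        if name == "__metadata__":  # optional free-form metadata block
            continue
        required = {"dtype", "shape", "data_offsets"}
        if not isinstance(entry, dict) or not required <= entry.keys():
            raise ValueError(f"tensor entry {name!r} is missing required fields")
    return header


# Build a tiny in-memory file with one hypothetical tensor named "w".
demo_header = json.dumps(
    {"w": {"dtype": "F32", "shape": [2], "data_offsets": [0, 8]}}
).encode("utf-8")
demo_file = struct.pack("<Q", len(demo_header)) + demo_header + b"\x00" * 8
print(read_safetensors_header(io.BytesIO(demo_file)))
```

Because the header is plain JSON and the payload is raw bytes, there is nothing for a loader to execute. A pickle stream passed to the same function fails these checks rather than running code.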
Stay tuned for more security-related improvements to Red Hat AI validated models coming soon.
January release: High-scale reasoning & NVFP4 innovation
The January release marked a technical milestone for Red Hat, underscoring our expanded collaboration with NVIDIA. A highlight from January was the release of an NVFP4 (NVIDIA 4-bit Floating Point) validated model, specifically optimized for the NVIDIA Blackwell architecture. This release also included a batch of compressed models, validated to maximize efficiency on your existing GPU infrastructure.
- Apertus-8B-Instruct-2509-FP8-dynamic: A breakthrough in transparent, compliant AI, this model was designed for regulated environments. It excels in multilingual tasks, supporting over 1,000 languages.
- Mistral-Large-3-675B-Instruct-2512 (natively FP8): The “heavy-hitter” for complex reasoning and enterprise-grade, multilingual tasks, it features a massive 256K context window.
- Mistral-Large-3-675B-NVFP4: By taking advantage of NVIDIA’s latest 4-bit floating point quantization, this version brings the power of Mistral Large to more accessible hardware configurations. It drastically reduces the VRAM required for deployment.
- NVIDIA-Nemotron-3-Nano-30B-A3B-FP8: A hybrid Mixture-of-Experts (MoE) model built for efficiency. It serves as a workhorse for AI agent systems and Retrieval-Augmented Generation (RAG), offering 128K context support and optimized reasoning traces.
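To put the VRAM savings from NVFP4 in rough numbers, the sketch below estimates weight-only memory for a 675B-parameter model at FP8 and at NVFP4. It rests on several assumptions: the parameter count is read from the model name, every parameter is quantized (real checkpoints often keep some layers at higher precision), and NVFP4 stores one 8-bit scale per block of 16 four-bit values. KV cache, activations, and runtime overhead are excluded, so treat the output as a back-of-the-envelope figure and not as a validated sizing.

```python
def effective_bits(value_bits: float, scale_bits: float = 0.0, block_size: int = 1) -> float:
    """Bits stored per weight, including the amortized per-block scale."""
    return value_bits + scale_bits / block_size


def weight_memory_gb(num_params: float, bits_per_weight: float) -> float:
    """Approximate weight storage in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bits_per_weight / 8 / 1e9


PARAMS = 675e9  # assumption: taken from the "675B" in the model name

FP8_BITS = effective_bits(8)  # per-tensor scale overhead treated as negligible
NVFP4_BITS = effective_bits(4, scale_bits=8, block_size=16)  # assumed block layout

fp8_gb = weight_memory_gb(PARAMS, FP8_BITS)
nvfp4_gb = weight_memory_gb(PARAMS, NVFP4_BITS)

print(f"FP8 weights:   ~{fp8_gb:.0f} GB")
print(f"NVFP4 weights: ~{nvfp4_gb:.0f} GB")
print(f"reduction:     ~{100 * (1 - nvfp4_gb / fp8_gb):.0f}%")
```

The per-block scale is why the saving lands somewhat below a clean 50%: under these assumptions each weight costs about 4.5 bits, not 4.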
February release: Vision, logic, and hybrid architectures
Our February batch focuses on specialized capabilities, from deep mathematical reasoning to multimodal "Vision-to-Action" architectures.
- Granite-4.0-h-small-FP8-dynamic: Built on IBM's newest hybrid Mamba-2/Transformer architecture, it delivers a 70% reduction in memory usage, relative to conventional Transformer models, for long-context RAG and multitool agent workflows.
- Granite-4.0-h-tiny-FP8-dynamic: The ultra-lightweight counterpart to the "small" variant. This model is designed for extreme efficiency at the edge or as a high-speed classifier in agentic pipelines, providing the same hybrid architectural benefits in a minimal footprint.
- Ministral-3-14B-Instruct-2512 (natively FP8): A "premier small model" with a vision encoder. It offers frontier-level performance for local chatbot and agentic applications in a footprint that fits easily on lower-VRAM hardware.
- Phi-4-reasoning-FP8-dynamic: Microsoft’s latest logic-heavy model, validated to provide top-tier performance in math and code-related tasks while maintaining a compact, edge-ready size.
- Qwen3-VL-235B-A22B-Instruct-NVFP4: Qwen’s premier Vision-Language (VL) model. It handles complex document parsing, spatial reasoning, and GUI automation, optimized via NVFP4 for scalable multimodal serving.
- Qwen3-Next-80B-A3B-Instruct-quantized.w4a16: A frontier-class MoE model that delivers the reasoning depth of an 80B architecture with the speed of only 3B active parameters per token. Validated in a weight-only 4-bit (w4a16) format, it’s specifically designed for enterprise applications where low-latency responses and complex instruction-following are critical.
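The difference between total and active parameters in an MoE model can be sketched with simple arithmetic: memory scales with the total parameter count, because every expert must be resident, while per-token compute scales with the active count. The example below uses the 80B and 3B figures from the model name, a weight-only 4-bit width that ignores group scales and zero points, and the common rule of thumb of about 2 FLOPs per active parameter per token. Attention cost over long contexts, routing overhead, and memory-bandwidth limits are all left out, so this is an illustration and not a performance prediction.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MoEProfile:
    """Rough size and compute profile of a Mixture-of-Experts model."""

    total_params: float   # all experts; must fit in memory
    active_params: float  # parameters used for any single token
    weight_bits: float    # storage width per weight

    def weight_memory_gb(self) -> float:
        """Weight-only storage in gigabytes (1 GB = 1e9 bytes)."""
        return self.total_params * self.weight_bits / 8 / 1e9

    def forward_gflops_per_token(self) -> float:
        """Rule-of-thumb forward cost: about 2 FLOPs per active parameter."""
        return 2 * self.active_params / 1e9


# Assumption: parameter counts are read from the "80B-A3B" model name.
moe = MoEProfile(total_params=80e9, active_params=3e9, weight_bits=4)
# Hypothetical dense model of the same total size, for comparison only.
dense = MoEProfile(total_params=80e9, active_params=80e9, weight_bits=4)

print(f"weights in memory: ~{moe.weight_memory_gb():.0f} GB")
print(f"MoE compute:       ~{moe.forward_gflops_per_token():.0f} GFLOPs per token")
print(f"dense compute:     ~{dense.forward_gflops_per_token():.0f} GFLOPs per token")
```

Under these assumptions the MoE model needs the same weight memory as a dense 80B model but far less compute per generated token, which is where the latency benefit comes from.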
Ready to get started?
Explore the full performance data and accuracy benchmarks on our Red Hat AI Hugging Face page or pull the latest ModelCar images directly from the Red Hat Container Registry.
About the author
My name is Rob Greenberg, Principal Product Manager for Red Hat AI, and I came over to Red Hat with the Neural Magic acquisition in January 2025. Prior to joining Red Hat, I spent 3 years at Neural Magic building and delivering tools that accelerate AI inference with optimized, open-source models. I've also had stints as a Digital Product Manager at Rocketbook and as a Technology Consultant at Accenture.