The transition from AI experimentation to production-grade deployment is often the most difficult hurdle for an enterprise. At Red Hat, we believe that choosing a model should come with predictable outcomes, rather than uncertainty. Our third-party model validation initiative is designed to remove the guesswork, providing the guidance and predictability organizations need to scale their AI infrastructure effectively.

The January and February 2026 batches of validated models are now available on the Red Hat AI Hugging Face page, coinciding with the Red Hat AI 3.3 release. These model releases introduce frontier-class reasoning and multimodal capabilities, packaged for simple, high-performance deployment on the Red Hat AI platform. 

Beyond the benchmarks: Validation as operational guidance

While public leaderboards provide a snapshot of a model's intelligence, they rarely tell you how that model will perform on specific hardware or within your production constraints. Think of our validation process as a safety rating for industrial equipment: it helps verify that the tool (in our case, the model) is powerful, reliable, and fit for its environment. Red Hat AI model validation provides precision guidance for capacity planning and reliability rather than a generic performance guarantee.

  • Established baselines: Using GuideLLM, we provide resource requirements and performance profiles across diverse hardware configurations, so you can right-size your infrastructure.
  • Integrity verification: Using lm-eval-harness, we help verify that optimizations, such as FP8 and NVFP4 quantization, preserve the model's accuracy. This allows you to gain efficiency without compromising quality.
  • Standardized deployment: Every model is packaged as a ModelCar, a specialized container format that treats AI models as standard OCI artifacts. This creates a reproducible, security-scanned, and version-controlled asset that is ready for high-throughput serving via vLLM or llm-d.
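The integrity-verification step above can be pictured as a simple accuracy gate. The sketch below is illustrative only: the task names, scores, and the 1% tolerance are hypothetical stand-ins, not the thresholds or harness that Red Hat's validation pipeline actually uses.

```python
# Hypothetical sketch of an accuracy gate: compare baseline vs. quantized
# eval scores and flag any task whose relative accuracy drop exceeds a
# tolerance. Scores and the 1% threshold below are illustrative only.

def accuracy_regressions(baseline, quantized, tolerance=0.01):
    """Return tasks whose relative accuracy drop exceeds `tolerance`."""
    failures = {}
    for task, base_score in baseline.items():
        quant_score = quantized[task]
        drop = (base_score - quant_score) / base_score
        if drop > tolerance:
            failures[task] = round(drop, 4)
    return failures

# Example: an FP8 checkpoint recovering within 1% of baseline on both tasks.
baseline = {"arc_challenge": 0.648, "gsm8k": 0.812}
quantized = {"arc_challenge": 0.645, "gsm8k": 0.806}
print(accuracy_regressions(baseline, quantized))  # -> {}
```

A real gate would pull both score sets from lm-eval-harness runs; the point here is only the shape of the check: quantization is accepted when every tracked task stays within tolerance of its baseline.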

Solving the “vetting bottleneck” with a secure model supply chain 

We help organizations accelerate deployment by automating security-by-design. Shifting security left in the AI lifecycle helps ensure models meet enterprise standards before they reach production.

  • Vulnerability scanning: ModelCars undergo a basic vulnerability scan as a core step in our containerization pipeline.
  • Signing: Our models and ModelCars are signed via the build pipeline using Sigstore and the Red Hat Trusted Artifact Signer; these signatures and attestations are currently hosted in the registry to support end-to-end integrity and authenticity.
  • Tamper protection: By taking advantage of the Hugging Face standard, SafeTensors, we neutralize "model-as-code" (pickle-based) threats, giving security and compliance teams the confidence to move assets into production faster.
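The "data, not code" property of SafeTensors comes from its layout: an 8-byte little-endian header length, a JSON header describing each tensor, then raw tensor bytes. Loading it never deserializes executable objects, unlike pickle, which can run arbitrary code on load. The minimal sketch below builds and re-parses a tiny file in memory using only the standard library; it is an illustration of the format, not Red Hat's pipeline code.

```python
# Minimal illustration of the SafeTensors layout: 8-byte little-endian
# header length, JSON header, then raw tensor bytes. Parsing it is pure
# data handling, so no code executes on load (unlike pickle).
import json
import struct

def build_safetensors(header: dict, tensor_bytes: bytes) -> bytes:
    header_json = json.dumps(header).encode("utf-8")
    return struct.pack("<Q", len(header_json)) + header_json + tensor_bytes

def read_safetensors_header(blob: bytes) -> dict:
    (header_len,) = struct.unpack("<Q", blob[:8])
    return json.loads(blob[8:8 + header_len])

header = {"weight": {"dtype": "F32", "shape": [2], "data_offsets": [0, 8]}}
blob = build_safetensors(header, struct.pack("<2f", 1.0, 2.0))
print(read_safetensors_header(blob)["weight"]["shape"])  # -> [2]
```

In practice you would load such files with the `safetensors` library rather than hand-rolled parsing; the sketch simply shows why a scanner or loader can inspect the file safely.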

Stay tuned for more security-related improvements to Red Hat AI validated models coming soon. 

January release: High-scale reasoning & NVFP4 innovation

The January release marked a technical milestone for Red Hat, underscoring our expanded collaboration with NVIDIA. A highlight from January was the release of an NVFP4 (NVIDIA 4-bit Floating Point) validated model, specifically optimized for the NVIDIA Blackwell architecture. This release also included a batch of compressed models, validated to maximize efficiency on your existing GPU infrastructure. 

  • Apertus-8B-Instruct-2509-FP8-dynamic: A breakthrough in transparent, compliant AI, this model was designed for regulated environments. It excels in multilingual tasks, supporting over 1,000 languages.
  • Mistral-Large-3-675B-Instruct-2512 (natively FP8): The “heavy-hitter” for complex reasoning and enterprise-grade, multilingual tasks, it features a massive 256K context window.
  • Mistral-Large-3-675B-NVFP4: By taking advantage of NVIDIA’s latest 4-bit floating point quantization, this version brings the power of Mistral Large to more accessible hardware configurations. It drastically reduces the VRAM required for deployment.
  • NVIDIA-Nemotron-3-Nano-30B-A3B-FP8: A hybrid Mixture-of-Experts (MoE) model built for efficiency. It serves as a workhorse for AI agent systems and Retrieval-Augmented Generation (RAG), offering 128K context support and optimized reasoning traces.
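To see why NVFP4 "drastically reduces the VRAM required," a back-of-envelope weight-memory calculation is enough. The sketch below counts weight storage only: KV cache, activations, and quantization scale overhead are deliberately ignored, so real deployments need headroom beyond these figures (consult the GuideLLM profiles for measured numbers).

```python
# Back-of-envelope weight memory for capacity planning. Assumption: weights
# only -- KV cache, activations, and quantization scale overhead are ignored,
# so real deployments need headroom beyond these figures.
def weight_memory_gb(params_billion: float, bits_per_param: float) -> float:
    bytes_total = params_billion * 1e9 * bits_per_param / 8
    return round(bytes_total / 1e9, 1)

print(weight_memory_gb(675, 8))  # FP8 675B   -> 675.0 GB
print(weight_memory_gb(675, 4))  # NVFP4 675B -> 337.5 GB
print(weight_memory_gb(30, 8))   # 30B FP8    -> 30.0 GB
```

Halving bits per parameter halves the weight footprint, which is what moves a 675B-class model from a multi-node deployment toward a single Blackwell node.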

February release: Vision, logic, and hybrid architectures

Our February batch focuses on specialized capabilities, from deep mathematical reasoning to multimodal "Vision-to-Action" architectures.

  • Granite-4.0-h-small-FP8-dynamic: IBM's newest hybrid Mamba-2/Transformer architecture. It delivers a 70% reduction in memory usage for long-context RAG and multitool agent workflows.
  • Granite-4.0-h-tiny-FP8-dynamic: The ultra-lightweight counterpart to the "small" variant. This model is designed for extreme efficiency at the edge or as a high-speed classifier in agentic pipelines, providing the same hybrid architectural benefits in a minimal footprint.
  • Ministral-3-14B-Instruct-2512 (natively FP8): A "premier small model" with a vision encoder. It offers frontier-level performance for local chatbot and agentic applications in a footprint that fits easily on lower-VRAM hardware.
  • Phi-4-reasoning-FP8-dynamic: Microsoft’s latest logic-heavy model. Validated to provide top-tier performance in math and code-related tasks while maintaining a compact, edge-ready size.
  • Qwen3-VL-235B-A22B-Instruct-NVFP4: Qwen’s premier Vision-Language (VL) model. It handles complex document parsing, spatial reasoning, and GUI automation, optimized via NVFP4 for scalable multimodal serving.
  • Qwen3-Next-80B-A3B-Instruct-quantized.w4a16: A frontier-class MoE model that delivers the reasoning depth of an 80B architecture with the speed of only 3B active parameters per token. Validated in a weight-only 4-bit (w4a16) format, it’s specifically designed for enterprise applications where low-latency responses and complex instruction-following are critical.
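The MoE trade-off behind the Qwen3-Next entry can be made concrete with the same kind of rough sizing. The sketch below assumes weight-only 4-bit storage (the "w4" in w4a16) and ignores KV cache and runtime overhead; it is illustrative sizing, not a measured profile.

```python
# Rough illustration of the w4a16 MoE trade-off: all 80B weights must be
# stored, but only ~3B parameters are active per token. Assumption: 4 bits
# per weight, no KV-cache or runtime overhead accounted for.
def w4_weight_gb(params_billion: float) -> float:
    return params_billion * 1e9 * 4 / 8 / 1e9  # 4 bits = 0.5 bytes per weight

total_params_b, active_params_b = 80, 3
print(f"weights in VRAM: ~{w4_weight_gb(total_params_b):.0f} GB")  # ~40 GB
print(f"params active per token: {active_params_b}B of {total_params_b}B")
```

Storage scales with the full 80B parameters, while per-token compute scales with the roughly 3B active ones, which is where the "80B reasoning depth at 3B speed" framing comes from.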

Ready to get started?

Explore the full performance data and accuracy benchmarks on our Red Hat AI Hugging Face page or pull the latest ModelCar images directly from the Red Hat Container Registry.


About the author

My name is Rob Greenberg, Principal Product Manager for Red Hat AI, and I came over to Red Hat with the Neural Magic acquisition in January 2025. Prior to joining Red Hat, I spent 3 years at Neural Magic building and delivering tools that accelerate AI inference with optimized, open-source models. I've also had stints as a Digital Product Manager at Rocketbook and as a Technology Consultant at Accenture.
