The transition from AI experimentation to production-grade deployment is often the most difficult hurdle for an enterprise. At Red Hat, we believe that choosing a model should come with predictable outcomes, rather than uncertainty. Our third-party model validation initiative is designed to remove the guesswork, providing the guidance and predictability organizations need to scale their AI infrastructure effectively.
The January and February 2026 batches of validated models are now available on the Red Hat AI Hugging Face page, coinciding with the Red Hat AI 3.3 release. These model releases introduce frontier-class reasoning and multimodal capabilities, packaged for simple, high-performance deployment on the Red Hat AI platform.
Beyond the benchmarks: Validation as operational guidance
While public leaderboards provide a snapshot of a model's intelligence, they rarely tell you how that model will perform on specific hardware or within your production constraints. Think of our validation process as a safety rating for industrial equipment: it helps verify that the tool, in our case the model, is powerful, reliable, and fit for its environment. Red Hat AI model validation provides precision guidance for capacity planning and reliability, rather than a generic performance guarantee.
- Established baselines: Using GuideLLM, we provide resource requirements and performance profiles across diverse hardware configurations, so you can right-size your infrastructure.
- Integrity verification: Using lm-eval-harness, we help verify that optimizations, such as FP8 and NVFP4 quantization, preserve the model's accuracy. This allows you to gain efficiency without compromising quality.
- Standardized deployment: Every model is packaged as a ModelCar, a specialized container format that treats AI models as standard OCI artifacts. This creates a reproducible, security-scanned, and version-controlled asset that is ready for high-throughput serving via vLLM or llm-d.
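GuideLLM produces measured profiles on real hardware, but the intuition behind right-sizing can be sketched from model geometry alone. The snippet below is a back-of-envelope estimate of KV-cache memory, one of the main drivers of serving capacity; the layer and head counts are hypothetical example values for an 8B-class model, not GuideLLM output.

```python
def kv_cache_bytes(num_layers: int, num_kv_heads: int, head_dim: int,
                   seq_len: int, batch_size: int, bytes_per_elem: int = 2) -> int:
    """Approximate KV-cache footprint: 2 tensors (K and V) per layer,
    each of shape [batch, seq_len, kv_heads, head_dim]."""
    return (2 * num_layers * num_kv_heads * head_dim
            * seq_len * batch_size * bytes_per_elem)

# Hypothetical 8B-class geometry (illustrative values only):
gib = kv_cache_bytes(num_layers=32, num_kv_heads=8, head_dim=128,
                     seq_len=8192, batch_size=16, bytes_per_elem=2) / 1024**3
print(f"~{gib:.1f} GiB of KV cache")
```

Estimates like this tell you roughly how much headroom remains after weights are loaded; the validated performance profiles replace the guesswork with measured numbers.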
Solving the “vetting bottleneck” with a secure model supply chain
We help organizations accelerate deployment by automating security-by-design. Shifting security left in the AI lifecycle helps ensure models meet enterprise standards before they reach production.
- Vulnerability scanning: ModelCars undergo a basic vulnerability scan as a core step in our containerization pipeline.
- Signing: Our models and ModelCars are signed via the build pipeline using Sigstore and the Red Hat Trusted Artifact Signer; these signatures and attestations are currently hosted in the registry to support end-to-end integrity and authenticity.
- Tamper protection: By taking advantage of the Hugging Face standard, SafeTensors, we neutralize "model-as-code" (pickle-based) threats, giving security and compliance teams the confidence to move assets into production faster.
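The security benefit of SafeTensors is structural: a file starts with an 8-byte little-endian length followed by a JSON header of tensor metadata, so loading it is pure data parsing, never code execution the way unpickling can be. A minimal sketch of reading that header with only the standard library (building a tiny in-memory file for illustration):

```python
import io
import json
import struct

def read_safetensors_header(f) -> dict:
    # First 8 bytes: little-endian u64 giving the JSON header size.
    (header_len,) = struct.unpack("<Q", f.read(8))
    # The header is plain JSON metadata -- data, not executable code.
    return json.loads(f.read(header_len))

# Construct a tiny valid file in memory: one tensor entry plus its raw bytes.
header = {"w": {"dtype": "F8_E4M3", "shape": [2, 2], "data_offsets": [0, 4]}}
blob = json.dumps(header).encode()
buf = io.BytesIO(struct.pack("<Q", len(blob)) + blob + b"\x00\x00\x00\x00")

print(read_safetensors_header(buf))
```

Contrast this with a pickle file, where loading can invoke arbitrary callables; with SafeTensors there is simply no code path to exploit.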
Stay tuned for more security-related improvements to Red Hat AI validated models coming soon.
January release: High-scale reasoning & NVFP4 innovation
The January release marked a technical milestone for Red Hat, underscoring our expanded collaboration with NVIDIA. A highlight from January was the release of an NVFP4 (NVIDIA 4-bit Floating Point) validated model, specifically optimized for the NVIDIA Blackwell architecture. This release also included a batch of compressed models, validated to maximize efficiency on your existing GPU infrastructure.
- Apertus-8B-Instruct-2509-FP8-dynamic: A breakthrough in transparent, compliant AI, this model was designed for regulated environments. It excels in multilingual tasks, supporting over 1,000 languages.
- Mistral-Large-3-675B-Instruct-2512 (natively FP8): The “heavy-hitter” for complex reasoning and enterprise-grade, multilingual tasks, it features a massive 256k context window.
- Mistral-Large-3-675B-NVFP4: By taking advantage of NVIDIA’s latest 4-bit floating point quantization, this version brings the power of Mistral Large to more accessible hardware configurations. It drastically reduces the VRAM required for deployment.
- NVIDIA-Nemotron-3-Nano-30B-A3B-FP8: A hybrid Mixture-of-Experts (MoE) model built for efficiency. It serves as a workhorse for AI agent systems and Retrieval-Augmented Generation (RAG), offering 128K context support and optimized reasoning traces.
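The VRAM savings from NVFP4 follow directly from bits per weight. The sketch below is a weights-only back-of-envelope estimate; it ignores KV cache, activations, and the small per-group scale overhead that block-scaled 4-bit formats add.

```python
def weight_gib(num_params: float, bits_per_weight: float) -> float:
    """Approximate weight memory in GiB at a given per-weight precision."""
    return num_params * bits_per_weight / 8 / 1024**3

params = 675e9  # Mistral-Large-3-675B parameter count
print(f"FP8:   ~{weight_gib(params, 8):.0f} GiB of weights")
print(f"NVFP4: ~{weight_gib(params, 4):.0f} GiB of weights")
```

Halving the bits per weight halves the weight footprint, which is what moves a model of this size into reach of smaller GPU configurations.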
February release: Vision, logic, and hybrid architectures
Our February batch focuses on specialized capabilities, from deep mathematical reasoning to multimodal "Vision-to-Action" architectures.
- Granite-4.0-h-small-FP8-dynamic: IBM's newest hybrid Mamba-2/Transformer architecture. It delivers a 70% reduction in memory usage for long-context RAG and multitool agent workflows.
- Granite-4.0-h-tiny-FP8-dynamic: The ultra-lightweight counterpart to the "small" variant. This model is designed for extreme efficiency at the edge or as a high-speed classifier in agentic pipelines, providing the same hybrid architectural benefits in a minimal footprint.
- Ministral-3-14B-Instruct-2512 (natively FP8): A "premier small model" with a vision encoder. It offers frontier-level performance for local chatbot and agentic applications in a footprint that fits easily on lower-VRAM hardware.
- Phi-4-reasoning-FP8-dynamic: Microsoft’s latest logic-heavy model. Validated to provide top-tier performance in math and code-related tasks while maintaining a compact, edge-ready size.
- Qwen3-VL-235B-A22B-Instruct-NVFP4: Qwen’s premier Vision-Language (VL) model. It handles complex document parsing, spatial reasoning, and GUI automation, optimized via NVFP4 for scalable multimodal serving.
- Qwen3-Next-80B-A3B-Instruct-quantized.w4a16: A frontier-class MoE model that delivers the reasoning depth of an 80B architecture with the speed of only 3B active parameters per token. Validated in a weight-only 4-bit (w4a16) format, it’s specifically designed for enterprise applications where low-latency responses and complex instruction-following are critical.
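The efficiency claim for sparse MoE models like Qwen3-Next-80B-A3B comes from the gap between stored and active parameters: all 80B weights must be resident in memory, but only about 3B participate in each token's forward pass. A rough sketch of what that means under w4a16 (4-bit weights, 16-bit activations); figures are approximate and ignore scale and activation overhead.

```python
GIB = 1024**3

def gib(n_bytes: float) -> float:
    return n_bytes / GIB

total_params, active_params = 80e9, 3e9
weight_bits = 4  # w4a16: weights quantized to 4 bits

# All experts must be resident in memory...
resident = gib(total_params * weight_bits / 8)
# ...but per-token compute only reads the routed experts' weights.
per_token = gib(active_params * weight_bits / 8)

print(f"resident weights: ~{resident:.0f} GiB, read per token: ~{per_token:.1f} GiB")
```

That small per-token working set is what gives the model dense-80B reasoning depth at something closer to small-model latency.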
Ready to get started?
Explore the full performance data and accuracy benchmarks on our Red Hat AI Hugging Face page or pull the latest ModelCar images directly from the Red Hat Container Registry.
About the author
My name is Rob Greenberg, Principal Product Manager for Red Hat AI, and I came over to Red Hat with the Neural Magic acquisition in January 2025. Prior to joining Red Hat, I spent 3 years at Neural Magic building and delivering tools that accelerate AI inference with optimized, open-source models. I've also had stints as a Digital Product Manager at Rocketbook and as a Technology Consultant at Accenture.