The transition from AI experimentation to production-grade deployment is often the most difficult hurdle for an enterprise. At Red Hat, we believe that choosing a model should come with predictable outcomes, rather than uncertainty. Our third-party model validation initiative is designed to remove the guesswork, providing the guidance and predictability organizations need to scale their AI infrastructure effectively.
The January and February 2026 batches of validated models are now available on the Red Hat AI Hugging Face page, coinciding with the Red Hat AI 3.3 release. These model releases introduce frontier-class reasoning and multimodal capabilities, packaged for simple, high-performance deployment on the Red Hat AI platform.
Beyond the benchmarks: Validation as operational guidance
While public leaderboards provide a snapshot of a model's intelligence, they rarely tell you how that model will perform on specific hardware or within your production constraints. Think of our validation process as a safety rating for industrial equipment: it helps verify that the tool, or the model in our case, is powerful, reliable, and fit for its environment. Red Hat AI model validation provides precision guidance for capacity planning and reliability, rather than a generic performance guarantee.
- Established baselines: Using GuideLLM, we provide resource requirements and performance profiles across diverse hardware configurations, so you can right-size your infrastructure.
- Integrity verification: Using lm-eval-harness, we help verify that optimizations, such as FP8 and NVFP4 quantization, preserve the model's accuracy. This allows you to gain efficiency without compromising quality.
- Standardized deployment: Every model is packaged as a ModelCar, a specialized container format that treats AI models as standard OCI artifacts. This creates a reproducible, security-scanned, and version-controlled asset that is ready for high-throughput serving via vLLM or llm-d.
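As a sketch of what the baseline and integrity checks above involve, the commands below run GuideLLM and lm-eval-harness against a locally served model. The model name, endpoint, and task selection are illustrative placeholders, and exact flags may differ between tool versions; this is not the validation pipeline itself.

```shell
# Serve a quantized model locally with vLLM (model name is illustrative)
vllm serve RedHatAI/granite-3.1-8b-instruct-FP8-dynamic --port 8000 &

# Capacity-planning baseline: sweep request rates with GuideLLM
# (verify flag names against your installed GuideLLM version)
guidellm benchmark \
  --target "http://localhost:8000" \
  --rate-type sweep \
  --max-seconds 60 \
  --data "prompt_tokens=512,output_tokens=256"

# Integrity check: measure the quantized model's accuracy on a
# standard task with lm-eval-harness
lm_eval --model vllm \
  --model_args pretrained=RedHatAI/granite-3.1-8b-instruct-FP8-dynamic \
  --tasks gsm8k \
  --batch_size auto
```

Comparing the lm-eval score of the quantized model against its full-precision parent is the same recovery check our validation uses to confirm that FP8 or NVFP4 compression preserved accuracy.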
Solving the “vetting bottleneck” with a secure model supply chain
We help organizations accelerate deployment by automating security-by-design. Shifting security left in the AI lifecycle helps ensure models meet enterprise standards before they reach production.
- Vulnerability scanning: ModelCars undergo a basic vulnerability scan as a core step in our containerization pipeline.
- Signing: Our models and ModelCars are signed via the build pipeline using Sigstore and the Red Hat Trusted Artifact Signer; these signatures and attestations are currently hosted in the registry to support end-to-end integrity and authenticity.
- Tamper protection: By taking advantage of the Hugging Face standard, SafeTensors, we neutralize "model-as-code" (pickle-based) threats, giving security and compliance teams the confidence to move assets into production faster.
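The "model-as-code" threat that SafeTensors neutralizes is easy to demonstrate with nothing but Python's standard library: a pickle file is a program, not inert data, so merely loading an untrusted checkpoint can execute attacker-chosen code. The sketch below sets a harmless flag in place of a real payload.

```python
import builtins
import pickle

class MaliciousPayload:
    """Mimics a booby-trapped checkpoint: __reduce__ tells pickle
    which callable to invoke when the bytes are deserialized."""
    def __reduce__(self):
        # A real attack would call os.system or similar; here we just
        # set a flag to prove code executed during loading.
        return (exec, ("import builtins; builtins.PWNED = True",))

blob = pickle.dumps(MaliciousPayload())   # looks like inert model bytes
pickle.loads(blob)                        # loading alone runs the payload
print(getattr(builtins, "PWNED", False))  # -> True
```

SafeTensors files, by contrast, are a flat tensor format with no deserialization hooks, so loading one cannot trigger code execution.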
Stay tuned for more security-related improvements to Red Hat AI validated models coming soon.
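To illustrate what consuming those registry-hosted signatures might look like, the snippet below sketches a Sigstore verification with cosign. The image path, signer identity, and OIDC issuer are placeholders, not Red Hat's actual values; consult the registry documentation before relying on them.

```shell
# Verify a ModelCar image's Sigstore signature before deploying it.
# All values below are placeholders for illustration only.
cosign verify \
  --certificate-identity-regexp "placeholder-signer-identity" \
  --certificate-oidc-issuer "https://placeholder-issuer.example.com" \
  registry.example.com/rhelai/modelcar-example:latest
```

A successful verification confirms both that the image is unmodified and that it was signed by the expected build identity, which is what makes the signature useful as a deployment gate.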
January release: High-scale reasoning & NVFP4 innovation
The January release marked a technical milestone for Red Hat, underscoring our expanded collaboration with NVIDIA. A highlight from January was the release of an NVFP4 (NVIDIA 4-bit Floating Point) validated model, specifically optimized for the NVIDIA Blackwell architecture. This release also included a batch of compressed models, validated to maximize efficiency on your existing GPU infrastructure.
- Apertus-8B-Instruct-2509-FP8-dynamic: A breakthrough in transparent, compliant AI, this model was designed for regulated environments. It excels in multilingual tasks, supporting over 1,000 languages.
- Mistral-Large-3-675B-Instruct-2512 (natively FP8): The “heavy-hitter” for complex reasoning and enterprise-grade, multilingual tasks, it features a massive 256k context window.
- Mistral-Large-3-675B-NVFP4: By taking advantage of NVIDIA’s latest 4-bit floating point quantization, this version brings the power of Mistral Large to more accessible hardware configurations. It drastically reduces the VRAM required for deployment.
- NVIDIA-Nemotron-3-Nano-30B-A3B-FP8: A hybrid Mixture-of-Experts (MoE) model built for efficiency. It serves as a workhorse for AI agent systems and Retrieval-Augmented Generation (RAG), offering 128K context support and optimized reasoning traces.
February release: Vision, logic, and hybrid architectures
Our February batch focuses on specialized capabilities, from deep mathematical reasoning to multimodal "Vision-to-Action" architectures.
- Granite-4.0-h-small-FP8-dynamic: IBM's newest hybrid Mamba-2/Transformer architecture. It delivers a 70% reduction in memory usage for long-context RAG and multitool agent workflows.
- Granite-4.0-h-tiny-FP8-dynamic: The ultra-lightweight counterpart to the "small" variant. This model is designed for extreme efficiency at the edge or as a high-speed classifier in agentic pipelines, providing the same hybrid architectural benefits in a minimal footprint.
- Ministral-3-14B-Instruct-2512 (natively FP8): A "premier small model" with a vision encoder. It offers frontier-level performance for local chatbot and agentic applications in a footprint that fits easily on lower-VRAM hardware.
- Phi-4-reasoning-FP8-dynamic: Microsoft’s latest logic-heavy model. Validated to provide top-tier performance in math and code-related tasks while maintaining a compact, edge-ready size.
- Qwen3-VL-235B-A22B-Instruct-NVFP4: Qwen’s premier Vision-Language (VL) model. It handles complex document parsing, spatial reasoning, and GUI automation, optimized via NVFP4 for scalable multimodal serving.
- Qwen3-Next-80B-A3B-Instruct-quantized.w4a16: A frontier-class MoE model that delivers the reasoning depth of an 80B architecture with the speed of only 3B active parameters per token. Validated in a weight-only 4-bit (w4a16) format, it’s specifically designed for enterprise applications where low-latency responses and complex instruction-following are critical.
Ready to get started?
Explore the full performance data and accuracy benchmarks on our Red Hat AI Hugging Face page or pull the latest ModelCar images directly from the Red Hat Container Registry.
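Because ModelCars are standard OCI artifacts, getting started can be as simple as a pull; in KServe-based deployments on Red Hat OpenShift AI, the same image is referenced as model storage. The image path below is a placeholder, not a real registry location.

```shell
# Pull a ModelCar like any other OCI artifact (path is a placeholder)
podman pull registry.example.com/rhelai/modelcar-example:latest

# In a KServe InferenceService, the same image is referenced as model
# storage for the vLLM serving runtime, e.g.:
#   storageUri: "oci://registry.example.com/rhelai/modelcar-example:latest"
```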
About the author
My name is Rob Greenberg, Principal Product Manager for Red Hat AI, and I came over to Red Hat with the Neural Magic acquisition in January 2025. Prior to joining Red Hat, I spent 3 years at Neural Magic building and delivering tools that accelerate AI inference with optimized, open-source models. I've also had stints as a Digital Product Manager at Rocketbook and as a Technology Consultant at Accenture.