The transition from AI experimentation to production-grade deployment is often the most difficult hurdle for an enterprise. At Red Hat, we believe that choosing a model should come with predictable outcomes, rather than uncertainty. Our third-party model validation initiative is designed to remove the guesswork, providing the guidance and predictability organizations need to scale their AI infrastructure effectively.
The January and February 2026 batches of validated models are now available on the Red Hat AI Hugging Face page, coinciding with the Red Hat AI 3.3 release. These model releases introduce frontier-class reasoning and multimodal capabilities, packaged for simple, high-performance deployment on the Red Hat AI platform.
Beyond the benchmarks: Validation as operational guidance
While public leaderboards provide a snapshot of a model's intelligence, they rarely tell you how that model will perform on specific hardware or within your production constraints. Think of our validation process like a safety rating for industrial equipment: it helps verify that the tool, or the model in our case, is powerful, reliable, and fit for its environment. Red Hat AI model validation provides precision guidance for capacity planning and reliability, rather than a generic performance guarantee.
- Established baselines: Using GuideLLM, we provide resource requirements and performance profiles across diverse hardware configurations, so you can right-size your infrastructure.
- Integrity verification: Using lm-eval-harness, we help verify that optimizations, such as FP8 and NVFP4 quantization, preserve the model's accuracy. This allows you to gain efficiency without compromising quality.
- Standardized deployment: Every model is packaged as a ModelCar, a specialized container format that treats AI models as standard OCI artifacts. This creates a reproducible, security-scanned, and version-controlled asset that is ready for high-throughput serving via vLLM or llm-d.
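To make the right-sizing idea above concrete, here is a back-of-envelope sketch of the kind of memory estimate that GuideLLM-style profiling informs. All model dimensions below are illustrative assumptions (a generic 8B-class dense model), not published Red Hat validation figures:

```python
def weight_gib(n_params: float, bits_per_param: float) -> float:
    """Memory needed just to hold the model weights, in GiB."""
    return n_params * bits_per_param / 8 / 2**30

def kv_cache_gib(n_layers: int, n_kv_heads: int, head_dim: int,
                 context_len: int, batch: int, bytes_per_val: int = 2) -> float:
    """KV-cache footprint for `batch` concurrent full-length sequences, in GiB."""
    # 2x accounts for the separate key and value tensors at each layer
    per_token = 2 * n_layers * n_kv_heads * head_dim * bytes_per_val
    return per_token * context_len * batch / 2**30

# Hypothetical 8B-class model with Llama-like dimensions (assumed values)
weights_fp8 = weight_gib(8e9, 8)
cache = kv_cache_gib(n_layers=32, n_kv_heads=8, head_dim=128,
                     context_len=8192, batch=16)
print(f"weights ~{weights_fp8:.1f} GiB, KV cache ~{cache:.1f} GiB")
```

Even this crude estimate shows why validated baselines matter: at realistic batch sizes the KV cache can rival or exceed the weights themselves, which a leaderboard score will never tell you.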
Solving the “vetting bottleneck” with a secure model supply chain
We help organizations accelerate deployment by automating security-by-design. Shifting security left in the AI lifecycle helps ensure models meet enterprise standards before they reach production.
- Vulnerability scanning: ModelCars undergo a basic vulnerability scan as a core step in our containerization pipeline.
- Signing: Our models and ModelCars are signed via the build pipeline using Sigstore and the Red Hat Trusted Artifact Signer; these signatures and attestations are currently hosted in the registry to support end-to-end integrity and authenticity.
- Tamper protection: By taking advantage of the Hugging Face standard, SafeTensors, we neutralize "model-as-code" (pickle-based) threats, giving security and compliance teams the confidence to move assets into production faster.
Stay tuned for more security-related improvements to Red Hat AI validated models coming soon.
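To see why SafeTensors neutralizes pickle-style "model-as-code" threats, the stdlib-only sketch below writes and re-parses a minimal .safetensors file. The format is simply a length-prefixed JSON header followed by raw tensor bytes, so parsing it never executes arbitrary code. This is a simplified illustration of the file format, not Red Hat tooling:

```python
import json
import struct

def write_minimal_safetensors(path: str) -> None:
    """Write a tiny, valid .safetensors file holding one 2-element F32 tensor."""
    header = {"w": {"dtype": "F32", "shape": [2], "data_offsets": [0, 8]}}
    hdr = json.dumps(header).encode()
    with open(path, "wb") as f:
        f.write(struct.pack("<Q", len(hdr)))   # 8-byte little-endian header size
        f.write(hdr)                           # JSON header: names, dtypes, shapes
        f.write(struct.pack("<2f", 1.0, 2.0))  # raw tensor bytes, nothing else

def read_header(path: str) -> dict:
    """Parse only the header; unlike pickle, there is no code path to execute."""
    with open(path, "rb") as f:
        (n,) = struct.unpack("<Q", f.read(8))
        return json.loads(f.read(n))
```

Because loading reduces to `json.loads` plus raw byte copies, a malicious checkpoint can at worst be malformed data; it cannot run a payload the way an unpickled object can.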
January release: High-scale reasoning & NVFP4 innovation
The January release marked a technical milestone for Red Hat, underscoring our expanded collaboration with NVIDIA. A highlight from January was the release of an NVFP4 (NVIDIA 4-bit Floating Point) validated model, specifically optimized for the NVIDIA Blackwell architecture. This release also included a batch of compressed models, validated to maximize efficiency on your existing GPU infrastructure.
- Apertus-8B-Instruct-2509-FP8-dynamic: A breakthrough in transparent, compliant AI, this model was designed for regulated environments. It excels in multilingual tasks, supporting over 1,000 languages.
- Mistral-Large-3-675B-Instruct-2512 (natively FP8): The “heavy-hitter” for complex reasoning and enterprise-grade, multilingual tasks, it features a massive 256k context window.
- Mistral-Large-3-675B-NVFP4: By taking advantage of NVIDIA’s latest 4-bit floating point quantization, this version brings the power of Mistral Large to more accessible hardware configurations. It drastically reduces the VRAM required for deployment.
- NVIDIA-Nemotron-3-Nano-30B-A3B-FP8: A hybrid Mixture-of-Experts (MoE) model built for efficiency. It serves as a workhorse for AI agent systems and Retrieval-Augmented Generation (RAG), offering 128K context support and optimized reasoning traces.
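To put the NVFP4 savings above in perspective, here is a back-of-envelope weight-memory estimate for a 675B-parameter model at different precisions. It ignores activation memory, the KV cache, and quantization scale overhead, so treat the numbers as rough illustration rather than measured figures:

```python
def weights_gib(n_params: float, bits: float) -> float:
    """Approximate weight-only memory footprint in GiB (ignores activations,
    KV cache, and the small overhead of quantization scale factors)."""
    return n_params * bits / 8 / 2**30

params = 675e9  # parameter count of Mistral-Large-3-675B, per the list above
for name, bits in [("BF16", 16), ("FP8", 8), ("NVFP4", 4)]:
    print(f"{name}: ~{weights_gib(params, bits):.0f} GiB")
```

Halving the bits per weight roughly halves the GPUs needed just to hold the model, which is why a 4-bit variant can move a 675B model from a large multi-node footprint toward far more accessible configurations.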
February release: Vision, logic, and hybrid architectures
Our February batch focuses on specialized capabilities, from deep mathematical reasoning to multimodal "Vision-to-Action" architectures.
- Granite-4.0-h-small-FP8-dynamic: IBM's newest hybrid Mamba-2/Transformer architecture. It delivers a 70% reduction in memory usage for long-context RAG and multitool agent workflows.
- Granite-4.0-h-tiny-FP8-dynamic: The ultra-lightweight counterpart to the "small" variant. This model is designed for extreme efficiency at the edge or as a high-speed classifier in agentic pipelines, providing the same hybrid architectural benefits in a minimal footprint.
- Ministral-3-14B-Instruct-2512 (natively FP8): A "premier small model" with a vision encoder. It offers frontier-level performance for local chatbot and agentic applications in a footprint that fits easily on lower-VRAM hardware.
- Phi-4-reasoning-FP8-dynamic: Microsoft’s latest logic-heavy model. Validated to provide top-tier performance in math and code-related tasks while maintaining a compact, edge-ready size.
- Qwen3-VL-235B-A22B-Instruct-NVFP4: Qwen’s premier Vision-Language (VL) model. It handles complex document parsing, spatial reasoning, and GUI automation, optimized via NVFP4 for scalable multimodal serving.
- Qwen3-Next-80B-A3B-Instruct-quantized.w4a16: A frontier-class MoE model that delivers the reasoning depth of an 80B architecture with the speed of only 3B active parameters per token. Validated in a weight-only 4-bit (w4a16) format, it’s specifically designed for enterprise applications where low-latency responses and complex instruction-following are critical.
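The "80B depth at 3B speed" claim for the MoE model above can be sketched with a common rule of thumb: decode compute scales with the parameters that are *active* per token, roughly 2 FLOPs per active parameter. This is an approximation that ignores attention and routing overhead, not a measured benchmark:

```python
def decode_flops_per_token(active_params: float) -> float:
    """Rule-of-thumb decode cost: ~2 FLOPs (one multiply-accumulate)
    per active parameter per generated token; attention cost ignored."""
    return 2 * active_params

dense_80b = decode_flops_per_token(80e9)  # a dense 80B model
moe_3b = decode_flops_per_token(3e9)      # MoE with ~3B active params per token
print(f"MoE decode is ~{dense_80b / moe_3b:.0f}x cheaper per token")
```

In other words, the router gives every token access to the full 80B parameter pool while the per-token arithmetic stays in small-model territory, which is what makes the low-latency claim plausible.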
Ready to get started?
Explore the full performance data and accuracy benchmarks on our Red Hat AI Hugging Face page or pull the latest ModelCar images directly from the Red Hat Container Registry.
About the author
My name is Rob Greenberg, Principal Product Manager for Red Hat AI, and I came over to Red Hat with the Neural Magic acquisition in January 2025. Prior to joining Red Hat, I spent 3 years at Neural Magic building and delivering tools that accelerate AI inference with optimized, open-source models. I've also had stints as a Digital Product Manager at Rocketbook and as a Technology Consultant at Accenture.