Features | Benefits |
--- | --- |
Model development and customization | Accelerate AI development using self-service notebooks and integrated development environments (IDEs) preloaded with curated AI/ML libraries. Speed up model development by integrating data ingestion, synthetic data, InstructLab, and retrieval-augmented generation (RAG). AutoRAG and AutoML (previews) automate optimization so teams can focus on more critical projects. |
Model training and experimentation | Cut training time and cost by running distributed workloads across GPU clusters with intelligent hardware allocation and experiment tracking. Versioned artifacts and reproducible workflows keep teams aligned, eliminating repeated work. |
Intelligent GPU and hardware acceleration | Maximize GPU utilization and control costs with intelligent workload scheduling, quota enforcement, and priority-based access across NVIDIA, AMD, and other accelerator hardware. Hardware profiles give platform teams real-time visibility into GPU consumption while letting data scientists provision accelerators on demand, without operational intervention. |
AI pipelines | Eliminate manual handoffs and reduce human error with automated, versioned AI pipelines. Each tracked run lets teams reproduce, audit, and optimize workflows from experimentation to production without relying on institutional knowledge. |
Optimized model serving | Serve large language models (LLMs) at production scale with high throughput and low latency using vLLM, and deploy predictive ML models using out-of-the-box and custom runtime servers. Achieve cost-efficient distributed inference with the llm-d framework for predictable, scalable performance. Reduce serving cost through LLM Compressor quantization and use a curated catalog of optimized, validated gen AI models to accelerate time to production. |
Agentic AI and gen AI user interfaces (UIs) | Speed up agentic AI workflows with an expanding focus on agent operations (Agent Ops) and connect agents to core platform services. The platform delivers a unified application programming interface (API) layer, Model Context Protocol (MCP) support, agentic APIs (e.g., the Open Responses API), and a dedicated dashboard experience (AI hub and gen AI studio). MLflow integration provides end-to-end agent traceability and observability, logging LLM calls and tool use for comprehensive visibility. |
Model observability and governance | Monitor model health by continuously tracking performance, data drift, and bias in real time, allowing proactive intervention before quality issues reach users. Pair runtime guardrails with LM Eval and GuideLLM benchmarking to validate models against real-world inference conditions, and capture audit trails through MLflow as compliance evidence for governance and regulatory requirements. |
Evaluation | Prevent costly production failures with EvalHub (preview), a unified evaluation control plane to scientifically benchmark, score, and assess models, RAG pipelines, and AI agents before and during deployment. Built-in domain-specific evaluation collections replace ad hoc manual testing with reproducible, standardized evaluation suites. |
Catalog and registry | Govern AI assets from a central registry including predictive and gen AI models, MCP servers, metadata, and deployment artifacts. A curated ecosystem of validated models reduces onboarding time while metadata management ensures traceability and compliance across hybrid cloud deployments. |
Feature store | Reduce data preparation time with a centralized feature store providing consistent, reusable feature sets. Shared definitions eliminate redundant feature engineering and training-serving skew, accelerating the path to production-ready models. |
Models-as-a-service | Provide AI engineers with self-service API access to approved models via a managed, built-in gateway. Usage tracking gives administrators visibility into consumption patterns for showback, quota enforcement, and cost accountability. |
AI safety and security | Catch common AI attacks such as jailbreaks, prompt injections, and toxic outputs before production with automated adversarial vulnerability scanning powered by Garak and NVIDIA NeMo Guardrails. Synthetic data generation (SDG, preview) creates tailored adversarial test datasets that validate guardrails against realistic threat scenarios and support the risk documentation required by AI regulations. |
Disconnected environments and edge | Deploy portable AI workloads across disconnected, air-gapped, and edge environments to meet strict data sovereignty and regulatory compliance requirements. |
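
The quota enforcement and priority-based GPU access described in the hardware row above can be illustrated with a minimal sketch. This is a toy model of the concept, not the product's scheduler; all class and field names here are hypothetical.

```python
import heapq
from dataclasses import dataclass, field

@dataclass(order=True)
class GpuRequest:
    priority: int                       # lower number = higher priority (popped first)
    team: str = field(compare=False)
    gpus: int = field(compare=False)

class QuotaScheduler:
    """Toy priority scheduler with per-team GPU quotas (illustrative only)."""
    def __init__(self, capacity, quotas):
        self.free = capacity            # GPUs currently unallocated
        self.quotas = quotas            # team -> max concurrent GPUs
        self.in_use = {team: 0 for team in quotas}
        self.queue = []

    def submit(self, request):
        heapq.heappush(self.queue, request)

    def schedule(self):
        """Grant queued requests in priority order while quota and capacity allow."""
        granted, deferred = [], []
        while self.queue:
            req = heapq.heappop(self.queue)
            within_quota = self.in_use[req.team] + req.gpus <= self.quotas[req.team]
            if within_quota and req.gpus <= self.free:
                self.free -= req.gpus
                self.in_use[req.team] += req.gpus
                granted.append(req)
            else:
                deferred.append(req)    # stays pending until capacity frees up
        self.queue = deferred
        heapq.heapify(self.queue)
        return granted
```

A request that would push its team past quota is deferred rather than rejected, mirroring the "priority-based access with quota enforcement" behavior the row describes.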
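
The reproducible, versioned runs described in the AI pipelines row can be sketched in a few lines: hashing a step's parameters and inputs yields a deterministic run ID, so identical runs are detected and cached rather than repeated. This is a conceptual illustration using only the standard library; the tracker class is hypothetical.

```python
import hashlib
import json

def run_id(params, data):
    """Deterministic run ID: the same params and data always hash to the
    same ID, making a tracked run reproducible and auditable."""
    payload = json.dumps({"params": params, "data": data}, sort_keys=True)
    return hashlib.sha256(payload.encode()).hexdigest()[:12]

class PipelineTracker:
    """Toy run tracker: records each step's params and output keyed by run ID."""
    def __init__(self):
        self.runs = {}

    def track(self, step, params, data, fn):
        rid = run_id({"step": step, **params}, data)
        if rid in self.runs:            # identical inputs: reuse the cached result
            return rid, self.runs[rid]["output"]
        output = fn(data, **params)
        self.runs[rid] = {"step": step, "params": params, "output": output}
        return rid, output
```

Because the ID is derived from inputs rather than assigned sequentially, two teams running the same step get the same run record, which is the property that eliminates repeated work.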
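
A simplified version of the data-drift tracking mentioned in the observability row: compare a live window of inputs against the reference (training) distribution and flag when the mean shifts too far. Production systems use richer statistics; this standardized mean shift is a stand-in to show the mechanism, and the function names are illustrative.

```python
from statistics import mean, stdev

def drift_score(reference, live):
    """Standardized mean shift of the live window vs. the reference data:
    how many reference standard deviations the live mean has moved."""
    mu, sigma = mean(reference), stdev(reference)
    return abs(mean(live) - mu) / sigma if sigma else 0.0

def check_drift(reference, live, threshold=2.0):
    """Flag drift when the live mean moves more than `threshold` std devs,
    enabling intervention before degraded inputs reach users."""
    return drift_score(reference, live) > threshold
```

A monitoring loop would evaluate `check_drift` on each new window and alert (or trigger retraining) when it returns `True`.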
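
The training-serving skew that the feature store row says shared definitions eliminate can be shown concretely: if training and serving both compute features from one registry of definitions, they cannot diverge. A minimal stdlib sketch with hypothetical feature names:

```python
from statistics import mean

# One shared registry of feature definitions, used by both the training
# pipeline and the serving path -- the property that prevents skew.
FEATURES = {
    "avg_order_value": lambda user: mean(user["orders"]) if user["orders"] else 0.0,
    "order_count": lambda user: len(user["orders"]),
}

def feature_vector(user, definitions=FEATURES):
    """Compute features from the shared definitions, identically in every
    environment (training, batch scoring, online serving)."""
    return {name: fn(user) for name, fn in definitions.items()}
```

Skew typically appears when serving code re-implements a feature (say, forgetting the empty-orders guard); routing both paths through the same registry removes that failure mode.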