This blog is an adaptation of our keynote presentation at PyTorch Day India.
In the debate between open source and proprietary technology, open source wins — especially in the AI arena. However, as the generative AI era continues, enterprises face a new version of an old challenge. While the industry is moving at breakneck speed, much of the underlying infrastructure remains fragmented or locked behind proprietary gates. If AI is to be the key to unlocking unprecedented potential, it must be open at every layer—from the datasets and training pipelines to the infrastructure and the serving layers.
At Red Hat, our vision for this open future is clear: any model, any accelerator, any cloud. To make this a reality, we’re betting on open source communities focused on making maximum impact, like the PyTorch community. PyTorch is the engine that drives flexibility, scalability, and—most importantly—accessibility for every enterprise, regardless of their hardware or cloud provider. To ensure that AI innovation remains democratic, we must protect the user's agency to choose the tools and hardware that best fit their specific needs.
Building for agency, democracy, and optionality
Today, much of the AI conversation centers on massive, "frontier" models. While impressive, these models can be unwieldy and expensive to manage, often requiring the latest, most power-hungry GPUs. This creates a barrier to entry that stifles innovation. The path forward is paved by composable intelligence, and to enable this within the community, Red Hat created the vLLM Semantic Router. By acting as an intelligent traffic controller that routes requests based on intent, this tool allows organizations to achieve high-level reasoning using more accessible, efficient infrastructure. This is how we make agentic development a practical reality for the open source community, moving the power of AI away from centralized monoliths and toward a more distributed, collaborative model.
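To make the routing idea concrete, here is a deliberately simplified sketch of intent-based routing. The endpoint URLs and the keyword heuristic are hypothetical; the real vLLM Semantic Router uses far more capable classification, but the principle of sending cheap queries to cheap hardware is the same:

```python
# Hypothetical sketch of intent-based routing: a lightweight classifier
# picks a backend so that simple queries hit a small, inexpensive model
# and only complex reasoning reaches a larger one. The URLs and the
# keyword heuristic are illustrative, not the Semantic Router's API.

BACKENDS = {
    "reasoning": "http://llm-large.internal/v1",  # bigger model, pricier GPUs
    "general":   "http://llm-small.internal/v1",  # small model, cheap hardware
}

def classify_intent(prompt: str) -> str:
    """Toy intent classifier; a real router would use an embedding model."""
    reasoning_cues = ("prove", "step by step", "derive", "plan")
    if any(cue in prompt.lower() for cue in reasoning_cues):
        return "reasoning"
    return "general"

def route(prompt: str) -> str:
    """Return the OpenAI-compatible endpoint this prompt should go to."""
    return BACKENDS[classify_intent(prompt)]

print(route("Derive the gradient of softmax step by step"))  # large model
print(route("What time zone is Bengaluru in?"))               # small model
```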
The heart of any AI strategy is the inference server, the layer where the model actually meets the user. To bring openness and high performance to the inference layer, Red Hat has become a primary driver of vLLM, one of the most critical projects within the PyTorch Foundation. Our work with vLLM is dedicated to making the software that serves a model as portable as the model itself, in the spirit of scaling AI and meeting the needs of the global enterprise. What does this look like in practice? In short: prioritizing day-zero vLLM enablement as soon as new models are released, looking beyond single-node performance, and co-founding llm-d. By disaggregating core components like prefill and decode, we’re helping to build a scalable, open standard for model serving that reflects the distributed, hybrid nature of the modern enterprise.
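For readers who have not used vLLM, here is a minimal offline-inference sketch. The checkpoint name is only an example; any Hugging Face model vLLM supports works the same way:

```python
# Minimal vLLM offline inference; the model checkpoint is just an example.
from vllm import LLM, SamplingParams

llm = LLM(model="Qwen/Qwen2.5-0.5B-Instruct")           # any supported HF model
params = SamplingParams(temperature=0.7, max_tokens=64)

outputs = llm.generate(["Why does open source matter for AI?"], params)
print(outputs[0].outputs[0].text)
```

The same model can also be exposed as an OpenAI-compatible server with `vllm serve <model>`; that serving path is the foundation llm-d builds on when it splits prefill and decode across nodes.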
Beyond software, hardware scarcity and cost remain some of the biggest hurdles in AI. Software should never force users to sacrifice agency or act as a bottleneck for hardware choice, which is why we champion a future in which PyTorch runs seamlessly across any environment, and why we invest in a PyTorch ecosystem that supports an array of accelerators. This includes:
- vllm-cpu: In collaboration with fellow industry leaders, we’re bringing high-performance inference to existing CPUs. This is vital for use cases where cost and power are significant constraints.
- OpenReg: This project creates a framework that allows for the rapid enablement of new, power-optimized hardware without requiring complex changes to the PyTorch core.
- Advanced kernels: Through projects like Helion and Triton, Red Hat is helping to simplify how developers target different accelerators, regardless of the underlying silicon (see the kernel sketch after this list).
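To make the kernel-level point concrete, below is the canonical Triton vector-add kernel: written once in Python and compiled for the target device. This sketch assumes a CUDA-capable GPU; Helion layers a higher-level authoring model on top of Triton-style kernels.

```python
# A minimal Triton kernel (vector add). The kernel is authored once in
# Python and compiled for the target accelerator, which is the
# portability point made above. Assumes a CUDA-capable GPU.
import torch
import triton
import triton.language as tl

@triton.jit
def add_kernel(x_ptr, y_ptr, out_ptr, n_elements, BLOCK_SIZE: tl.constexpr):
    pid = tl.program_id(axis=0)
    offsets = pid * BLOCK_SIZE + tl.arange(0, BLOCK_SIZE)
    mask = offsets < n_elements            # guard the tail of the tensor
    x = tl.load(x_ptr + offsets, mask=mask)
    y = tl.load(y_ptr + offsets, mask=mask)
    tl.store(out_ptr + offsets, x + y, mask=mask)

def add(x: torch.Tensor, y: torch.Tensor) -> torch.Tensor:
    out = torch.empty_like(x)
    n = out.numel()
    grid = (triton.cdiv(n, 1024),)         # one program per 1024-element block
    add_kernel[grid](x, y, out, n, BLOCK_SIZE=1024)
    return out

x = torch.rand(4096, device="cuda")
y = torch.rand(4096, device="cuda")
assert torch.allclose(add(x, y), x + y)
```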
Together, these investments ensure that the PyTorch community remains the most versatile and hardware-agnostic home for AI development.
Hardening PyTorch for the global enterprise
For AI to move from experimental labs to mission-critical production environments—the kind that run the world’s banks, airlines, and hospitals—it needs industrial-strength reliability. As the third-highest global contributor to the PyTorch project, Red Hat's focus isn't just on new features; it's on enterprise hardening. Our teams have fixed over 60 issues in torch.compile to ensure stability under heavy workloads and have integrated Red Hat Enterprise Linux (RHEL) into the official PyTorch upstream CI. When we contribute to PyTorch, we do so to ensure that every organization can deploy AI with the same confidence it has in its operating system.
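As a quick illustration of the code path this hardening protects, here is a minimal torch.compile example. The toy model is ours for illustration, but `torch.compile` itself is the standard PyTorch API:

```python
# Minimal torch.compile usage; compilation stability under long-running,
# heavy workloads is exactly what the hardening work described above targets.
import torch

class TinyMLP(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.net = torch.nn.Sequential(
            torch.nn.Linear(128, 256), torch.nn.ReLU(), torch.nn.Linear(256, 10)
        )

    def forward(self, x):
        return self.net(x)

model = torch.compile(TinyMLP())     # JIT-compiles on first call
out = model(torch.randn(32, 128))    # later calls reuse the compiled graph
print(out.shape)                     # torch.Size([32, 10])
```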
The principles of access, agency, and freedom allow us to build on each other's ideas and advance technology faster than any single organization could alone. Whether it’s any model, any accelerator, or any cloud, we are committed to building the open, enterprise-ready foundations the world needs to reach its full potential.
About the authors
Steve Watt is a Distinguished Engineer and leads the Red Hat Office of the CTO, which includes the Research, Emerging Technologies, and Open Source Program Office organizations. Prior to joining Red Hat, Steve was the founder of the Hadoop business and Hadoop Chief Technologist at HP, and a Software Architect and Master Inventor at IBM Emerging Technologies. Prior to IBM, Steve worked for a number of consumer-facing software startups in the USA and his native South Africa.
Sudhir Dharanendraiah is a senior AI engineer at Red Hat.