This blog is an adaptation of our keynote presentation at PyTorch Day India.
In the debate between open source and proprietary technology, open source wins — especially in the AI arena. However, as the generative AI era continues, enterprises face a new version of an old challenge. While the industry is moving at breakneck speed, much of the underlying infrastructure remains fragmented or locked behind proprietary gates. If AI is to be the key to unlocking unprecedented potential, it must be open at every layer—from the datasets and training pipelines to the infrastructure and the serving layers.
At Red Hat, our vision for this open future is clear: any model, any accelerator, any cloud. To make this a reality, we’re betting on open source communities focused on making maximum impact, like the PyTorch community. PyTorch is the engine that drives flexibility, scalability, and—most importantly—accessibility for every enterprise, regardless of their hardware or cloud provider. To ensure that AI innovation remains democratic, we must protect the user's agency to choose the tools and hardware that best fit their specific needs.
Building for agency, democracy, and optionality
Today, much of the AI conversation centers on massive, "frontier" models. While impressive, these models can be unwieldy and expensive to manage, often requiring the latest, most power-hungry GPUs. This creates a barrier to entry that stifles innovation. The path forward is paved by composable intelligence, and to enable this within the community, Red Hat created the vLLM Semantic Router. By acting as an intelligent traffic controller that routes requests based on intent, this tool allows organizations to achieve high-level reasoning using more accessible, efficient infrastructure. This is how we make agentic development a practical reality for the open source community, moving the power of AI away from centralized monoliths and toward a more distributed, collaborative model.
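To make the routing idea concrete, here is a deliberately simplified sketch of intent-based routing. The endpoint URLs and the keyword heuristic are hypothetical; the real vLLM Semantic Router uses far more capable classification, but the principle of sending cheap queries to cheap hardware is the same:

```python
# Hypothetical sketch of intent-based routing: a lightweight classifier
# picks a backend so that simple queries hit a small, inexpensive model
# and only complex reasoning reaches a larger one. The URLs and the
# keyword heuristic are illustrative, not the Semantic Router's API.

BACKENDS = {
    "reasoning": "http://llm-large.internal/v1",  # bigger model, pricier GPUs
    "general":   "http://llm-small.internal/v1",  # small model, cheap hardware
}

def classify_intent(prompt: str) -> str:
    """Toy intent classifier; a real router would use an embedding model."""
    reasoning_cues = ("prove", "step by step", "derive", "plan")
    if any(cue in prompt.lower() for cue in reasoning_cues):
        return "reasoning"
    return "general"

def route(prompt: str) -> str:
    """Return the OpenAI-compatible endpoint this prompt should go to."""
    return BACKENDS[classify_intent(prompt)]

print(route("Derive the gradient of softmax step by step"))  # large model
print(route("What time zone is Bengaluru in?"))               # small model
```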
The heart of any AI strategy is the inference server, the layer where the model actually meets the user. To bring openness and high performance to the inference layer, Red Hat has become a primary driver of vLLM, one of the most critical projects within the PyTorch Foundation. Our work with vLLM is dedicated to making the software that serves a model as portable as the model itself, in the spirit of scaling AI and meeting the needs of the global enterprise. What does this look like in practice? In short: prioritizing day-zero vLLM enablement as soon as new models are released, looking beyond single-node performance, and co-founding llm-d. By disaggregating core components like prefill and decode, we’re helping to build a scalable, open standard for model serving that reflects the distributed, hybrid nature of the modern enterprise.
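For readers who have not used vLLM, here is a minimal offline-inference sketch. The checkpoint name is only an example; any Hugging Face model vLLM supports works the same way:

```python
# Minimal vLLM offline inference; the model checkpoint is just an example.
from vllm import LLM, SamplingParams

llm = LLM(model="Qwen/Qwen2.5-0.5B-Instruct")           # any supported HF model
params = SamplingParams(temperature=0.7, max_tokens=64)

outputs = llm.generate(["Why does open source matter for AI?"], params)
print(outputs[0].outputs[0].text)
```

The same model can also be exposed as an OpenAI-compatible server with `vllm serve <model>`; that serving path is the foundation llm-d builds on when it splits prefill and decode across nodes.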
Beyond software, hardware scarcity and cost remain some of the biggest hurdles in AI. Software should never force users to sacrifice agency or act as a bottleneck for hardware choice, which is why we champion a future in which PyTorch runs seamlessly across any environment, and why we invest in a PyTorch ecosystem that supports an array of accelerators. This includes:
- vllm-cpu: In collaboration with fellow industry leaders, we’re bringing high-performance inference to existing CPUs. This is vital for use cases where cost and power are significant constraints.
- OpenReg: This project creates a framework that allows for the rapid enablement of new, power-optimized hardware without requiring complex changes to the PyTorch core.
- Advanced kernels: Through projects like Helion and Triton, Red Hat is helping to simplify how developers target different accelerators, regardless of the underlying silicon (see the kernel sketch after this list).
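To make the kernel-level point concrete, below is the canonical Triton vector-add kernel: written once in Python and compiled for the target device. This sketch assumes a CUDA-capable GPU; Helion layers a higher-level authoring model on top of Triton-style kernels.

```python
# A minimal Triton kernel (vector add). The kernel is authored once in
# Python and compiled for the target accelerator, which is the
# portability point made above. Assumes a CUDA-capable GPU.
import torch
import triton
import triton.language as tl

@triton.jit
def add_kernel(x_ptr, y_ptr, out_ptr, n_elements, BLOCK_SIZE: tl.constexpr):
    pid = tl.program_id(axis=0)
    offsets = pid * BLOCK_SIZE + tl.arange(0, BLOCK_SIZE)
    mask = offsets < n_elements            # guard the tail of the tensor
    x = tl.load(x_ptr + offsets, mask=mask)
    y = tl.load(y_ptr + offsets, mask=mask)
    tl.store(out_ptr + offsets, x + y, mask=mask)

def add(x: torch.Tensor, y: torch.Tensor) -> torch.Tensor:
    out = torch.empty_like(x)
    n = out.numel()
    grid = (triton.cdiv(n, 1024),)         # one program per 1024-element block
    add_kernel[grid](x, y, out, n, BLOCK_SIZE=1024)
    return out

x = torch.rand(4096, device="cuda")
y = torch.rand(4096, device="cuda")
assert torch.allclose(add(x, y), x + y)
```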
Together, these investments ensure that the PyTorch community remains the most versatile and hardware-agnostic home for AI development.
Hardening PyTorch for the global enterprise
For AI to move from experimental labs to mission-critical production environments—the kind that run the world’s banks, airlines, and hospitals—it needs industrial-strength reliability. As the third-highest global contributor to the PyTorch project, Red Hat's focus isn't just on new features; it's on enterprise hardening. Our teams have fixed over 60 issues in torch.compile to ensure stability under heavy workloads and have integrated Red Hat Enterprise Linux (RHEL) into the official PyTorch upstream CI. When we contribute to PyTorch, we do so to ensure that every organization can deploy AI with the same confidence it has in its operating system.
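As a quick illustration of the code path this hardening protects, here is a minimal torch.compile example. The toy model is ours for illustration, but `torch.compile` itself is the standard PyTorch API:

```python
# Minimal torch.compile usage; compilation stability under long-running,
# heavy workloads is exactly what the hardening work described above targets.
import torch

class TinyMLP(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.net = torch.nn.Sequential(
            torch.nn.Linear(128, 256), torch.nn.ReLU(), torch.nn.Linear(256, 10)
        )

    def forward(self, x):
        return self.net(x)

model = torch.compile(TinyMLP())     # JIT-compiles on first call
out = model(torch.randn(32, 128))    # later calls reuse the compiled graph
print(out.shape)                     # torch.Size([32, 10])
```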
The principles of access, agency, and freedom allow us to build on each other's ideas and advance technology faster than any single organization could alone. Whether it’s any model, any accelerator, or any cloud, we are committed to building the open, enterprise-ready foundations the world needs to reach its full potential.
About the authors
Steve Watt is a Distinguished Engineer and leads the Red Hat Office of the CTO, which includes the Research, Emerging Technologies, and Open Source Program Office organizations. Prior to joining Red Hat, Steve was the founder of the Hadoop business and Hadoop Chief Technologist at HP, and a Software Architect and Master Inventor at IBM Emerging Technologies. Prior to IBM, Steve worked for a number of consumer-facing software startups in the USA and his native South Africa.
Sudhir Dharanendraiah is a senior AI engineer at Red Hat.