Modern IT is driven just as much by economic decisions as by technical concepts. The costs of hardware, cloud services, storage, and more all factor into how CIOs and IT leaders budget and deploy their strategies. And now, with AI, we have another cross-disciplinary factor to incorporate: token economics, or how our AI strategies absorb the volatile costs of the underlying AI models.
Currently, most enterprise AI relies on calling frontier model APIs and paying for tokens consumed and generated. While this is an easy starting point, the math is changing. Token consumption is skyrocketing because new reasoning models often consume 10 to 20 times more tokens than standard models just to "think" through a problem.
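To make that math concrete, here is a minimal sketch of how per-query cost shifts when a reasoning model "thinks" in billed output tokens. All prices and token counts below are illustrative assumptions, not published rates for any specific provider.

```python
def query_cost(input_tokens, output_tokens, price_in_per_m, price_out_per_m):
    """Cost in dollars for one API call, given per-million-token prices."""
    return (input_tokens / 1e6) * price_in_per_m + (output_tokens / 1e6) * price_out_per_m

# Assumed prices: $3 per million input tokens, $15 per million output tokens.
standard = query_cost(2_000, 800, 3.0, 15.0)

# A reasoning model may emit 10-20x more output tokens for the same prompt,
# because its intermediate "thinking" is billed as output (15x assumed here).
reasoning = query_cost(2_000, 800 * 15, 3.0, 15.0)

print(f"standard:  ${standard:.4f}")
print(f"reasoning: ${reasoning:.4f}  ({reasoning / standard:.1f}x)")
```

Because output tokens typically dominate the bill, the per-query cost tracks the token multiplier almost one-for-one, which is why a model swap alone can reshape an AI budget.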
As we move into the era of AI agents that iterate, call tools, and chain tasks, this consumption compounds exponentially. To thrive in this new economy, organizations must evolve from consuming tokens to providing them. This means that success is predicated on owning your inference infrastructure, routing model queries to the most cost-effective endpoint, and even running self-hosted models optimized for your specific business needs.
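The routing idea above can be sketched as a simple policy: send each query to the cheapest endpoint that meets its capability requirement. The endpoint names, prices, and quality tiers below are hypothetical placeholders, not real services.

```python
from dataclasses import dataclass

@dataclass
class Endpoint:
    name: str
    cost_per_m_tokens: float  # blended $ per million tokens (assumed)
    capability: int           # rough quality tier, higher is better

ENDPOINTS = [
    Endpoint("self-hosted-8b", 0.20, 1),
    Endpoint("self-hosted-70b", 1.50, 2),
    Endpoint("frontier-api", 12.00, 3),
]

def route(required_capability: int) -> Endpoint:
    """Pick the cheapest endpoint whose quality tier satisfies the request."""
    candidates = [e for e in ENDPOINTS if e.capability >= required_capability]
    return min(candidates, key=lambda e: e.cost_per_m_tokens)

print(route(1).name)  # simple extraction task -> cheapest self-hosted model
print(route(3).name)  # hard reasoning task   -> frontier model
```

Even a policy this crude illustrates the economics: if most agent steps are tier-1 tasks, routing them away from the frontier API cuts the bulk of the spend while reserving premium capacity for the queries that need it.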
At Red Hat, we view this journey as a path from "Metal to Agents". It requires a fully integrated, open stack where every layer—from the physical AI accelerators to the agents themselves—is connected and built with system security front and center. This foundation must support a diverse ecosystem of hardware, including NVIDIA, AMD, and Intel, as well as custom silicon from major cloud providers. Then, on top of this hardware layer sits the AI infrastructure, starting with security-centric Linux and Kubernetes environments that deliver consistent reliability, whether on a server rack or from orbiting satellites.
The beating heart of an AI system is inference, which is the determining factor for scaling AI strategies. Red Hat’s leadership in projects like vLLM and our work on distributed inference with llm-d means we are uniquely experienced in optimizing model execution and GPU utilization at the software level. In real-world applications, we have already seen these technologies deliver a 10x reduction in time-to-first-token and a 3x improvement in output throughput. Without control over both performance and cost, organizations will eventually be forced into trade-offs that neither finance teams nor customers will accept.
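Time-to-first-token (TTFT) is easy to reason about with a small sketch: it is the wall-clock delay between issuing a request and receiving the first streamed token. The generator below stands in for a streaming inference response (for example, an OpenAI-compatible endpoint served with streaming enabled); the 50 ms delay is an arbitrary stand-in for prefill latency.

```python
import time

def time_to_first_token(stream):
    """Return the first token and the seconds elapsed until it arrived."""
    start = time.monotonic()
    first = next(stream)
    return first, time.monotonic() - start

def fake_stream():
    # Stand-in for a real streaming response; the sleep simulates the
    # prefill phase that TTFT measures.
    time.sleep(0.05)
    yield "Hello"
    yield " world"

token, ttft = time_to_first_token(fake_stream())
print(token, f"{ttft:.3f}s")
```

In production the same measurement is taken against the live endpoint, and it is precisely this number that techniques like disaggregated prefill in distributed inference aim to shrink.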
AI models, however, don’t know your business differentiators unless you teach them. This is why Retrieval Augmented Generation (RAG) and fine-tuning are what make AI a true differentiator. Businesses can connect models to unique internal documentation and customer history, creating models that genuinely understand the specific expertise and domain knowledge of the business.
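The core RAG pattern is simple: retrieve the internal documents most relevant to a query, then prepend them to the prompt so the model answers from your data. The sketch below uses naive keyword overlap for scoring and a hard-coded document list; a real system would use embeddings and a vector store.

```python
# Toy internal knowledge base (illustrative content only).
DOCS = [
    "Refund policy: customers may return hardware within 30 days.",
    "Support SLA: priority tickets are answered within 4 hours.",
    "Shipping: orders over $500 ship free within the EU.",
]

def retrieve(query: str, k: int = 1) -> list[str]:
    """Rank documents by shared words with the query; return the top k."""
    words = set(query.lower().split())
    scored = sorted(DOCS, key=lambda d: -len(words & set(d.lower().split())))
    return scored[:k]

def build_prompt(query: str) -> str:
    """Assemble the augmented prompt: retrieved context, then the question."""
    context = "\n".join(retrieve(query))
    return f"Context:\n{context}\n\nQuestion: {query}"

print(build_prompt("what is the refund policy for hardware"))
```

The key design point is that the model itself never changes: differentiation comes from what you retrieve, which is why RAG pairs naturally with self-hosted inference over proprietary data.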
All of this is now table stakes. The current frontier is agent services. Agents are no longer experiments; they are the core of modern enterprise strategy. But they bring a "Bring Your Own Agent" challenge, where developers, data scientists, and marketing teams are all using different tools, from LangChain to OpenClaw. An effective strategy must support this choice while maintaining rigorous IT control. This means giving every agent a verified identity, driving lifecycle management for versioning and rollbacks, and using emerging standards like the Model Context Protocol (MCP) to connect agents to tools and data without creating security gaps.
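What "verified identity plus lifecycle management" means in practice can be sketched with a tiny registry: each agent gets a stable derived identity and a versioned record the platform can roll back. Everything here is hypothetical and illustrative; in a real deployment the identity would come from certificates or a SPIFFE-style scheme, not a bare hash.

```python
import hashlib
from dataclasses import dataclass, field

@dataclass
class AgentRecord:
    name: str
    versions: list[str] = field(default_factory=list)

    @property
    def identity(self) -> str:
        # Stable identity derived from the agent name (stand-in for a
        # real cryptographic workload identity).
        return hashlib.sha256(self.name.encode()).hexdigest()[:16]

class Registry:
    def __init__(self):
        self._agents: dict[str, AgentRecord] = {}

    def register(self, name: str, version: str) -> str:
        """Record a new agent version; return the agent's identity."""
        rec = self._agents.setdefault(name, AgentRecord(name))
        rec.versions.append(version)
        return rec.identity

    def rollback(self, name: str) -> str:
        """Drop the latest version (if any prior exists) and return the active one."""
        rec = self._agents[name]
        if len(rec.versions) > 1:
            rec.versions.pop()
        return rec.versions[-1]

reg = Registry()
reg.register("support-agent", "1.0")
reg.register("support-agent", "1.1")
print(reg.rollback("support-agent"))  # back to "1.0"
```

The point of the sketch is governance, not mechanics: whatever agent framework a team brings, the platform tracks who the agent is and which version is live, so a bad rollout is a rollback rather than an incident.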
We see this vision in action with organizations like BNP Paribas, which has generated nearly $600 million in value by industrializing 1,000 AI use cases on a unified platform. They transformed GPU provisioning from a weeks-long bottleneck into a minutes-long service, proving that speed and digital sovereignty can coexist. Similarly, NASA Marshall Space Flight Center has adopted these unified platforms to move thousands of legacy workloads into containerized environments, reducing deployment times from days to minutes to support mission-critical space-borne operations.
These customers are turning AI strategies from being solely focused on efficiency and cost savings into growth drivers. Yes, we want to be more efficient with AI, but focusing only on that is reductive. The next leap forward for AI is aligning it to growth: not just protecting the bottom line, but moving the top line higher.
Ultimately, the goal of an enterprise AI strategy should be that when the market moves again (it will), you own the platform underpinning what matters to YOU. You don’t need to make a forced choice between frontier model power and control, governance, and security. By embracing an open, integrated stack, you can have both. You can provide the model access your teams need while maintaining a security posture that your IT team can actually defend. This is the only way to build a strategy that compounds in your favor, turning the rapid pace of disruption into a long-term competitive advantage.
About the author
Chris Wright is senior vice president and chief technology officer (CTO) at Red Hat. Wright leads the Office of the CTO, which is responsible for incubating emerging technologies and developing forward-looking perspectives on innovations such as artificial intelligence, cloud computing, distributed storage, software defined networking and network functions virtualization, containers, automation and continuous delivery, and distributed ledger.
During his more than 20 years as a software engineer, Wright has worked in the telecommunications industry on high availability and distributed systems, and in the Linux industry on security, virtualization, and networking. He has been a Linux developer for more than 15 years, most of that time spent working deep in the Linux kernel. He is passionate about open source software serving as the foundation for next generation IT systems.