Modern IT is driven just as much by economic decisions as by technical concepts. The costs of hardware, cloud services, storage, and more all factor into how CIOs and IT leaders budget and deploy their strategies. And now, with AI, we have another cross-disciplinary factor to incorporate: token economics, or how our AI strategies absorb the volatile costs of the underlying AI models.
Currently, most enterprise AI relies on calling frontier model APIs and paying for tokens consumed and generated. While this is an easy starting point, the math is changing. Token consumption is skyrocketing because new reasoning models often consume 10 to 20 times more tokens than standard models just to "think" through a problem.
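To make that math concrete, here is a minimal sketch of how per-query cost shifts when a reasoning model "thinks" in billed output tokens. All prices and token counts below are illustrative assumptions, not published rates for any specific provider.

```python
def query_cost(input_tokens, output_tokens, price_in_per_m, price_out_per_m):
    """Cost in dollars for one API call, given per-million-token prices."""
    return (input_tokens / 1e6) * price_in_per_m + (output_tokens / 1e6) * price_out_per_m

# Assumed prices: $3 per million input tokens, $15 per million output tokens.
standard = query_cost(2_000, 800, 3.0, 15.0)

# A reasoning model may emit 10-20x more output tokens for the same prompt,
# because its intermediate "thinking" is billed as output (15x assumed here).
reasoning = query_cost(2_000, 800 * 15, 3.0, 15.0)

print(f"standard:  ${standard:.4f}")
print(f"reasoning: ${reasoning:.4f}  ({reasoning / standard:.1f}x)")
```

Because output tokens typically dominate the bill, the per-query cost tracks the token multiplier almost one-for-one, which is why a model swap alone can reshape an AI budget.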
As we move into the era of AI agents that iterate, call tools, and chain tasks, this consumption compounds exponentially. To thrive in this new economy, organizations must evolve from consuming tokens to providing them. This means that success is predicated on owning your inference infrastructure, routing model queries to the most cost-effective endpoint, and even running self-hosted models optimized for your specific business needs.
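The routing idea above can be sketched as a simple policy: send each query to the cheapest endpoint that meets its capability requirement. The endpoint names, prices, and quality tiers below are hypothetical placeholders, not real services.

```python
from dataclasses import dataclass

@dataclass
class Endpoint:
    name: str
    cost_per_m_tokens: float  # blended $ per million tokens (assumed)
    capability: int           # rough quality tier, higher is better

ENDPOINTS = [
    Endpoint("self-hosted-8b", 0.20, 1),
    Endpoint("self-hosted-70b", 1.50, 2),
    Endpoint("frontier-api", 12.00, 3),
]

def route(required_capability: int) -> Endpoint:
    """Pick the cheapest endpoint whose quality tier satisfies the request."""
    candidates = [e for e in ENDPOINTS if e.capability >= required_capability]
    return min(candidates, key=lambda e: e.cost_per_m_tokens)

print(route(1).name)  # simple extraction task -> cheapest self-hosted model
print(route(3).name)  # hard reasoning task   -> frontier model
```

Even a policy this crude illustrates the economics: if most agent steps are tier-1 tasks, routing them away from the frontier API cuts the bulk of the spend while reserving premium capacity for the queries that need it.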
At Red Hat, we view this journey as a path from "Metal to Agents". It requires a fully integrated, open stack where every layer—from the physical AI accelerators to the agents themselves—is connected and built with system security front and center. This foundation must support a diverse ecosystem of hardware, including NVIDIA, AMD, and Intel, as well as custom silicon from major cloud providers. Then, on top of this hardware layer sits the AI infrastructure, starting with security-centric Linux and Kubernetes environments that deliver consistent reliability, whether on a server rack or from orbiting satellites.
The beating heart of an AI system is inference, which is the determining factor for scaling AI strategies. Red Hat’s leadership in projects like vLLM and our work on distributed inference with llm-d means we are uniquely experienced in optimizing model execution and GPU utilization at the software level. In real-world applications, we have already seen these technologies deliver a 10x reduction in time-to-first-token and a 3x improvement in output throughput. Without control over both performance and cost, organizations will eventually be forced into trade-offs that neither finance teams nor customers will accept.
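Time-to-first-token (TTFT) is easy to reason about with a small sketch: it is the wall-clock delay between issuing a request and receiving the first streamed token. The generator below stands in for a streaming inference response (for example, an OpenAI-compatible endpoint served with streaming enabled); the 50 ms delay is an arbitrary stand-in for prefill latency.

```python
import time

def time_to_first_token(stream):
    """Return the first token and the seconds elapsed until it arrived."""
    start = time.monotonic()
    first = next(stream)
    return first, time.monotonic() - start

def fake_stream():
    # Stand-in for a real streaming response; the sleep simulates the
    # prefill phase that TTFT measures.
    time.sleep(0.05)
    yield "Hello"
    yield " world"

token, ttft = time_to_first_token(fake_stream())
print(token, f"{ttft:.3f}s")
```

In production the same measurement is taken against the live endpoint, and it is precisely this number that techniques like disaggregated prefill in distributed inference aim to shrink.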
AI models, however, don’t know your business differentiators unless you teach them. This is why Retrieval Augmented Generation (RAG) and fine-tuning are what make AI a true differentiator. Businesses can connect models to unique internal documentation and customer history, creating models that genuinely understand the specific expertise and domain knowledge of the business.
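The core RAG pattern is simple: retrieve the internal documents most relevant to a query, then prepend them to the prompt so the model answers from your data. The sketch below uses naive keyword overlap for scoring and a hard-coded document list; a real system would use embeddings and a vector store.

```python
# Toy internal knowledge base (illustrative content only).
DOCS = [
    "Refund policy: customers may return hardware within 30 days.",
    "Support SLA: priority tickets are answered within 4 hours.",
    "Shipping: orders over $500 ship free within the EU.",
]

def retrieve(query: str, k: int = 1) -> list[str]:
    """Rank documents by shared words with the query; return the top k."""
    words = set(query.lower().split())
    scored = sorted(DOCS, key=lambda d: -len(words & set(d.lower().split())))
    return scored[:k]

def build_prompt(query: str) -> str:
    """Assemble the augmented prompt: retrieved context, then the question."""
    context = "\n".join(retrieve(query))
    return f"Context:\n{context}\n\nQuestion: {query}"

print(build_prompt("what is the refund policy for hardware"))
```

The key design point is that the model itself never changes: differentiation comes from what you retrieve, which is why RAG pairs naturally with self-hosted inference over proprietary data.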
All of this is now table stakes. The current frontier is agent services. Agents are no longer experiments; they are the core of modern enterprise strategy. But they bring a "Bring Your Own Agent" challenge, where developers, data scientists, and marketing teams are all using different tools, from LangChain to OpenClaw. An effective strategy must support this choice while maintaining rigorous IT control. This means giving every agent a verified identity, driving lifecycle management for versioning and rollbacks, and using emerging standards like the Model Context Protocol (MCP) to connect agents to tools and data without creating security gaps.
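What "verified identity plus lifecycle management" means in practice can be sketched with a tiny registry: each agent gets a stable derived identity and a versioned record the platform can roll back. Everything here is hypothetical and illustrative; in a real deployment the identity would come from certificates or a SPIFFE-style scheme, not a bare hash.

```python
import hashlib
from dataclasses import dataclass, field

@dataclass
class AgentRecord:
    name: str
    versions: list[str] = field(default_factory=list)

    @property
    def identity(self) -> str:
        # Stable identity derived from the agent name (stand-in for a
        # real cryptographic workload identity).
        return hashlib.sha256(self.name.encode()).hexdigest()[:16]

class Registry:
    def __init__(self):
        self._agents: dict[str, AgentRecord] = {}

    def register(self, name: str, version: str) -> str:
        """Record a new agent version; return the agent's identity."""
        rec = self._agents.setdefault(name, AgentRecord(name))
        rec.versions.append(version)
        return rec.identity

    def rollback(self, name: str) -> str:
        """Drop the latest version (if any prior exists) and return the active one."""
        rec = self._agents[name]
        if len(rec.versions) > 1:
            rec.versions.pop()
        return rec.versions[-1]

reg = Registry()
reg.register("support-agent", "1.0")
reg.register("support-agent", "1.1")
print(reg.rollback("support-agent"))  # back to "1.0"
```

The point of the sketch is governance, not mechanics: whatever agent framework a team brings, the platform tracks who the agent is and which version is live, so a bad rollout is a rollback rather than an incident.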
We see this vision in action with organizations like BNP Paribas, which has generated nearly $600 million in value by industrializing 1,000 AI use cases on a unified platform. They transformed GPU provisioning from a weeks-long bottleneck into a minutes-long service, proving that speed and digital sovereignty can coexist. Similarly, NASA Marshall Space Flight Center has adopted these unified platforms to move thousands of legacy workloads into containerized environments, reducing deployment times from days to minutes to support mission-critical space-borne operations.
These customers are turning AI strategies from being solely focused on efficiency and cost savings into growth drivers. Yes, we want to be more efficient with AI, but focusing only on that is reductive. The next leap forward for AI is aligning it to growth: not just protecting the bottom line, but moving the top line higher.
Ultimately, the goal of an enterprise AI strategy should be that when the market moves again (it will), you own the platform underpinning what matters to YOU. You don’t need to make a forced choice between frontier model power and control, governance, and security. By embracing an open, integrated stack, you can have both. You can provide the model access your teams need while maintaining a security posture that your IT team can actually defend. This is the only way to build a strategy that compounds in your favor, turning the rapid pace of disruption into a long-term competitive advantage.
About the author
Chris Wright is senior vice president and chief technology officer (CTO) at Red Hat. Wright leads the Office of the CTO, which is responsible for incubating emerging technologies and developing forward-looking perspectives on innovations such as artificial intelligence, cloud computing, distributed storage, software defined networking and network functions virtualization, containers, automation and continuous delivery, and distributed ledger.
During his more than 20 years as a software engineer, Wright has worked in the telecommunications industry on high availability and distributed systems, and in the Linux industry on security, virtualization, and networking. He has been a Linux developer for more than 15 years, most of that time spent working deep in the Linux kernel. He is passionate about open source software serving as the foundation for next generation IT systems.