AI has existed—at least conceptually—since the 1950s. Like most scientific and technological fields, AI development has plateaued, jumped forward, then plateaued again: a cycle of hype, disillusionment, and then more realistic progress that has played out over decades.
Today, AI has reached an inflection point. What was once experimental is now production-ready, and what was once theoretical is now transforming how enterprises operate.
The catalyst? A 2017 research paper from Google, "Attention Is All You Need," introduced the transformer architecture: the foundation upon which today's large language models (LLMs) are built. Transformers allowed AI systems to process vast sets of data and understand context in ways previous approaches couldn't match. Combined with advances in computational power and storage capacity, this breakthrough delivered the ability to generate human-like text, realistic images, and working software code.
But this is just the starting point. Continued work to develop models, perfect inference techniques, and integrate different AI approaches has brought us to systems that can perform complex reasoning and problem-solving, plan and execute actions with autonomy, and learn from their interactions. For enterprises, this opens the door to sophisticated automation and creative solutions that weren't possible even a few years ago.
This guide will help you better understand the models delivering modern innovation, assess your organization’s readiness, choose the right approach, and build a practical roadmap to adopt and scale AI with confidence.
The types of AI models accelerating innovation
Large language models (LLMs) and image-generation models are among the technologies driving the recent explosive growth of generative AI.
LLMs (such as those from OpenAI, Anthropic, and Meta) are pretrained on massive datasets to process and generate natural language, making them invaluable for customer support automation, marketing copy generation, and more. Image generation models (such as Stable Diffusion, Midjourney, and DALL-E), on the other hand, create visuals from text prompts, fueling innovation in entertainment, marketing, and beyond.
The rise of reasoning and planning models
A new class of reasoning models emerged in 2025, fundamentally changing how AI approaches complex problems. These models use reinforcement learning to develop chain-of-thought reasoning, self-verification, and error correction capabilities, forming the basis for agentic AI workflows.
Enterprises are adopting a multimodel approach, using multiple specialized models rather than one monolithic system: large-scale reasoning models handle complex planning tasks (typically 5–10% of queries), while simpler requests are routed to smaller models with 7–13 billion parameters. This pattern can deliver significant savings on inference costs—the computational expense of processing each query—while maintaining quality.
AI models process text in chunks called tokens, roughly equivalent to a word or part of a word. Tokens represent the currency of AI: Pricing for AI services is typically calculated per million tokens processed, making token cost a key consideration for enterprise organizations deploying AI at scale. For example, a customer service implementation might route 80% of simple queries to smaller, more cost-efficient models, reserving resource-intensive reasoning models for complex issues that justify the higher per-token cost.
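The economics of this routing pattern can be sketched with simple arithmetic. The prices and the 80/20 split below are illustrative assumptions, not actual vendor pricing:

```python
# Hypothetical per-million-token prices (illustrative only, not real vendor rates).
SMALL_MODEL_PRICE = 0.20   # USD per 1M tokens, 7-13B parameter model
REASONING_PRICE = 15.00    # USD per 1M tokens, large reasoning model

def blended_cost_per_million(small_share: float) -> float:
    """Average cost per million tokens when `small_share` of traffic
    goes to the small model and the rest to the reasoning model."""
    return small_share * SMALL_MODEL_PRICE + (1 - small_share) * REASONING_PRICE

# Routing 80% of queries to the small model:
cost_routed = blended_cost_per_million(0.80)   # 0.8*0.20 + 0.2*15.00 = 3.16
cost_all_reasoning = blended_cost_per_million(0.0)  # everything to the big model
savings = 1 - cost_routed / cost_all_reasoning      # roughly 79% cheaper
```

Even with these rough numbers, sending the bulk of traffic to a smaller model cuts the blended per-token cost by a large margin, which is why query routing is a common first optimization for AI at scale.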
Emerging trends to consider
Businesses increasingly use multiple AI models in concert. Modern models also now support function calling, allowing them to interact with external tools, application programming interfaces (APIs), and databases, transforming them from text generators to action-takers.
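Mechanically, function calling works by giving the model a schema describing each available tool; when the model decides a tool is needed, it emits a structured call that the application executes and feeds back. The sketch below shows that loop in miniature; the tool name, schema fields, and dispatch logic are illustrative, not any specific vendor's API:

```python
import json

# Hypothetical tool definition in the JSON-schema style most providers use
# (the name and fields here are illustrative).
weather_tool = {
    "name": "get_weather",
    "description": "Look up current weather for a city.",
    "parameters": {
        "type": "object",
        "properties": {"city": {"type": "string"}},
        "required": ["city"],
    },
}

def get_weather(city: str) -> str:
    # Stand-in for a real weather API call.
    return f"Sunny in {city}"

# Simulated model output: providers return tool calls as structured
# data roughly like this, with arguments serialized as JSON.
tool_call = {"name": "get_weather", "arguments": json.dumps({"city": "Boston"})}

# The application dispatches the call and returns the result to the
# model on the next turn, turning the model into an action-taker.
registry = {"get_weather": get_weather}
args = json.loads(tool_call["arguments"])
result = registry[tool_call["name"]](**args)   # "Sunny in Boston"
```

The key design point is that the model never executes anything itself: the application owns the registry of tools, validates arguments against the schema, and decides what actually runs.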
One example of this trend is Llama Stack, Meta's open-source AI runtime environment launched in 2024, which standardizes how organizations build and deploy multimodel AI systems. Think of it as Kubernetes for AI agents: just as Kubernetes orchestrates containers, Llama Stack orchestrates agents and their providers, offering common APIs for inference, retrieval-augmented generation (RAG), agents, tools, and safety that work consistently across development and production environments.
Open source: A foundation for AI innovation
Red Hat’s AI strategy is deeply rooted in open source, helping enterprises to advance both predictive and gen AI with transparency, trust, and lower costs. By using Red Hat’s open hybrid cloud platforms, organizations can innovate freely while maintaining control over their AI solutions.
Take control of LLMs with open source
While gen AI is changing nearly every aspect of business, from how software is made to how we communicate, the models (LLMs and others) behind a gen AI capability are often tightly controlled by the service provider. This makes it difficult for an enterprise to evaluate the capabilities of a gen AI service without specialized skills and significant investments of both money and time.
Without visibility into the datasets that created the model or an understanding of how the model uses that data, organizations are exposed to potential risks related to AI-generated content. What if your code-generation model is trained on copyrighted source code? Does any code generated by that model now also belong to the owner of that copyrighted code?
Many questions like these have not been fully answered, but understanding the consequences is imperative. Enterprises are turning to open source AI to ensure they retain access to and control over their data, and can understand how it will be handled and used.
Red Hat has always believed in the power of open source to propel innovation, and a transparent approach to software development that gives customers control over the choices they make. That same philosophy now extends to AI. Our approach centers on that transparency and choice, and provides the stability and proactive support enterprise organizations need, no matter which model or models they deploy.
Choosing the right model is hard. With thousands of options available, how do you know which will perform reliably in production?