What is generative AI?

Published August 7, 2023 • 9-minute read

Generative AI is a kind of artificial intelligence technology that relies on deep learning models to create new content.

Generative AI applications can produce writing, pictures, code, and more. This happens during AI inference, the operational phase of AI, in which the model applies what it learned during training to real-world situations. Common use cases for generative AI include chatbots, image creation and editing, software code assistance, and scientific research.

People are putting generative AI to use in professional settings to quickly visualize creative ideas and efficiently handle boring and time-consuming tasks. In areas like medical research and product design, generative AI can help professionals do their jobs better and significantly faster. However, generative AI also introduces new risks which users should understand and work to mitigate.


If you’ve enjoyed a surprisingly coherent conversation with ChatGPT, or watched Midjourney render a realistic picture from a description you just made up, you know generative AI can feel like magic. What makes this sorcery possible?

Beneath the AI apps you use, deep learning models are recreating patterns they’ve learned from a vast amount of training data. Then they work within human-constructed parameters to make something new based on what they’ve learned.

Deep learning models do not store a copy of their training data, but rather an encoded version of it, with similar data points arranged close together. This representation can then be decoded to construct new, original data with similar characteristics.
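To make "similar data points arranged close together" concrete, here is a minimal sketch that compares encodings with cosine similarity. The three-dimensional vectors are hypothetical and hand-picked for illustration; a real model learns encodings with hundreds or thousands of dimensions.

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; values near 1.0 mean 'very similar'."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical encodings (assumption: invented values, not taken from any real model).
embeddings = {
    "cat": [0.9, 0.8, 0.1],
    "kitten": [0.85, 0.9, 0.15],
    "truck": [0.1, 0.2, 0.95],
}

def nearest(word: str) -> str:
    """Return the other word whose encoding lies closest to `word`."""
    others = (w for w in embeddings if w != word)
    return max(others, key=lambda w: cosine_similarity(embeddings[word], embeddings[w]))

print(nearest("cat"))  # "kitten" sits closer to "cat" than "truck" does
```

In this toy space, related concepts end up as neighbors, which is the property that lets a model decode nearby points into new data with similar characteristics.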

Building a custom generative AI app requires a model, as well as adjustments such as human-supervised fine-tuning or a layer of data specific to a use case.

Most of today’s popular generative AI apps respond to user prompts. Describe what you want in natural language and the app returns whatever you asked for—like magic.

Learn how AI can work for the enterprise

Generative AI’s breakthroughs in writing and images have captured news headlines and people’s imaginations. Here are a few of the early use cases for this rapidly advancing technology.

Writing. Even before ChatGPT captured headlines (and began writing its own), generative AI systems were good at mimicking human writing. Language translation tools were among the first use cases for generative AI models. Current generative AI tools can respond to prompts for high-quality content creation on practically any topic. These tools can also adapt their writing to different lengths and various writing styles.

Image generation. Generative AI image tools can synthesize high-quality pictures in response to prompts for countless subjects and styles. Some AI tools, such as Generative Fill in Adobe Photoshop, can add new elements to existing works.

Speech and music generation. Using written text and sample audio of a person’s voice, AI vocal tools can create narration or singing that mimics the sound of a real human. Other tools can create artificial music from prompts or samples.

Video generation. New services are experimenting with various generative AI techniques to create motion graphics. For example, some can match audio to a still image and animate the subject’s mouth and facial expression so that the subject appears to talk.

Code generation and completion. Some generative AI tools can take a written prompt and output computer code on request to assist software developers.

Data augmentation. Generative AI can create a large amount of synthetic data when using real data is impossible or not preferable. For example, synthetic data can be useful if you want to train a model to understand healthcare data without including any personally identifiable information. It can also be used to stretch a small or incomplete data set into a larger set of synthetic data for training or testing purposes.

Agentic AI. Agentic AI and generative AI work collaboratively. Agentic AI systems may use gen AI to converse with a user, independently create content as part of a greater goal, or communicate with external tools. In other words, gen AI is a critical part of agentic AI's "cognitive process."

Explore generative AI use cases

Deep learning, which makes generative AI possible, is a machine learning technique for analyzing and interpreting large amounts of data. Also known as deep neural learning or deep neural networking, this process teaches computers to learn through observation, imitating the way humans gain knowledge. Deep learning is a critical concept in applying computers to the problem of understanding human language, or natural language processing (NLP).

It may help to think of deep learning as a type of flow chart, starting with an input layer and ending with an output layer. Sandwiched between these two layers are the “hidden layers” which process information at different levels, adjusting and adapting their behavior as they continuously receive new data. Deep learning models can have hundreds of hidden layers, each of which plays a part in discovering relationships and patterns within the data set.

Starting with the input layer, which is composed of several nodes, data is introduced to the model and categorized accordingly before it’s moved forward to the next layer. The path that the data takes through each layer is based upon the calculations set in place for each node. Eventually, the data moves through each layer, picking up observations along the way that ultimately create the output, or final analysis, of the data.
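The flow from input layer through hidden layers to output can be sketched in a few lines. This is a simplified illustration, not a production network: the layer sizes and weights are hypothetical, and real models learn their weights during training rather than having them written by hand.

```python
import math

def sigmoid(x: float) -> float:
    """Squash any number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def dense_layer(inputs, weights, biases):
    """Each node computes a weighted sum of its inputs plus a bias, then applies sigmoid."""
    return [
        sigmoid(sum(w * x for w, x in zip(node_weights, inputs)) + b)
        for node_weights, b in zip(weights, biases)
    ]

def forward(inputs, layers):
    """Pass the data through every layer in order; each layer's output feeds the next."""
    activations = inputs
    for weights, biases in layers:
        activations = dense_layer(activations, weights, biases)
    return activations

# Hypothetical network: 2 inputs -> 3 hidden nodes -> 1 output node.
layers = [
    ([[0.5, -0.2], [0.1, 0.4], [-0.3, 0.8]], [0.0, 0.1, -0.1]),  # hidden layer
    ([[0.7, -0.5, 0.2]], [0.05]),                                # output layer
]

output = forward([1.0, 2.0], layers)
print(output)  # a single value between 0 and 1
```

A model with hundreds of hidden layers follows the same loop, only with far more nodes per layer and learned weights.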

One technology that has sped the advancement of deep learning is the GPU, or graphics processing unit. GPUs were originally architected to accelerate the rendering of video game graphics. But as an efficient way to perform calculations in parallel, GPUs have proven to be well suited for deep learning workloads.

Breakthroughs in the size and speed of deep learning models led directly to the current wave of breakthrough generative AI apps.

A neural network is a way of processing information that mimics biological neural systems like the connections in our own brains. It’s how AI can forge connections among seemingly unrelated sets of information. The concept of a neural network is closely related to deep learning.

How does a deep learning model use the neural network concept to connect data points? Start with how the human brain works. Our brains contain many interconnected neurons, which act as information messengers when the brain is processing incoming data. These neurons use electrical impulses and chemical signals to communicate with one another and transmit information between different areas of the brain.

An artificial neural network (ANN) is based on this biological phenomenon, but formed by artificial neurons that are made from software modules called nodes. These nodes use mathematical calculations (instead of chemical signals as in the brain) to communicate and transmit information. This simulated neural network (SNN) processes data by clustering data points and making predictions.

Different neural network techniques are suited for different kinds of data. A recurrent neural network (RNN) is a model built for sequential data. For example, it can process language by reading the words of a sentence in order.
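As a rough sketch of what "sequential" means here, the recurrent step below carries a hidden state from one input to the next, so the order of the inputs changes the result. The scalar weights are hypothetical; real RNNs use learned weight matrices and vector-valued states.

```python
import math

def rnn_step(x: float, h_prev: float, w_x: float, w_h: float, b: float) -> float:
    """Combine the current input with the previous hidden state."""
    return math.tanh(w_x * x + w_h * h_prev + b)

def run_rnn(sequence, w_x=0.5, w_h=0.8, b=0.0):
    """Process the sequence one element at a time, in order."""
    h = 0.0  # the hidden state starts empty
    for x in sequence:
        h = rnn_step(x, h, w_x, w_h, b)
    return h

# The same two values in a different order leave a different final state.
state_a = run_rnn([1.0, 0.0])
state_b = run_rnn([0.0, 1.0])
print(state_a, state_b)
```

Because each step depends on the one before it, the loop cannot be run in parallel across the sequence.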

Building on the idea of the RNN, transformers are a specific kind of neural network architecture that can process language faster. Transformers learn the relationships among all the words in a sentence at once, which is more efficient than an RNN ingesting each word in sequential order.
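The mechanism transformers use to relate words to one another is called self-attention. The sketch below is a simplified version that assumes each word is already a small vector and skips the learned query, key, and value projections a real transformer applies; the word vectors are hypothetical.

```python
import math

def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(vectors):
    """Scaled dot-product attention with queries = keys = values = the inputs."""
    d = len(vectors[0])
    outputs, all_weights = [], []
    for q in vectors:
        # Score this word against every word in the sentence, itself included.
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in vectors]
        weights = softmax(scores)
        # The new representation is a weighted mix of all the word vectors.
        mixed = [sum(w * v[i] for w, v in zip(weights, vectors)) for i in range(d)]
        outputs.append(mixed)
        all_weights.append(weights)
    return outputs, all_weights

# Hypothetical 2-dimensional vectors for a three-word sentence.
sentence = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
outputs, weights = self_attention(sentence)
print(weights[0])  # how strongly word 0 attends to each word
```

Unlike a recurrent step, every row of scores can be computed independently, which is why this approach parallelizes well on GPUs.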

A large language model (LLM) is a deep learning model trained by applying transformers to a massive set of generalized data. LLMs power many of the popular AI chat and text tools.

Another deep learning technique, the diffusion model, has proven to be a good fit for image generation. Diffusion models learn the process of turning a natural image into blurry visual noise. Then generative image tools take the process and reverse it—starting with a random noise pattern and refining it until it resembles a realistic picture.
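The "image into noise" direction can be sketched with one common formulation, in which a noisy sample is a blend of the original signal and Gaussian noise. This is an assumption about which formulation to show: the `alpha_bar` parameter (the fraction of signal retained) and the four-pixel "image" are illustrative only, and the learned reverse process that generates pictures is not shown.

```python
import math
import random

def add_noise(pixels, alpha_bar, rng):
    """Blend the image with Gaussian noise.

    alpha_bar = 1.0 keeps the image intact; alpha_bar = 0.0 leaves pure noise.
    """
    signal = math.sqrt(alpha_bar)
    noise_scale = math.sqrt(1.0 - alpha_bar)
    return [signal * p + noise_scale * rng.gauss(0.0, 1.0) for p in pixels]

rng = random.Random(0)           # fixed seed so runs are repeatable
image = [0.2, 0.5, 0.8, 0.3]     # hypothetical 4-pixel "image"

for alpha_bar in (1.0, 0.5, 0.0):
    print(alpha_bar, add_noise(image, alpha_bar, rng))
```

A diffusion model is trained to undo steps like these, so that at generation time it can start from random noise and refine it toward a realistic picture.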

The size of a deep learning model is often described by its number of parameters, the values the model adjusts during training. A simple credit prediction model trained on 10 inputs from a loan application form would have 10 parameters. By contrast, an LLM can have billions of parameters. OpenAI’s Generative Pre-trained Transformer 4 (GPT-4), one of the foundation models that power ChatGPT, is reported to have 1 trillion parameters.
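Counting parameters is simple arithmetic for a fully connected network: one weight per connection between layers, plus (in many implementations) one bias per node. The sketch below assumes that kind of network; the layer sizes are hypothetical.

```python
def count_parameters(layer_sizes, include_bias=True):
    """Count the weights (and optionally biases) in a fully connected network.

    layer_sizes lists the number of nodes per layer, input layer first.
    """
    total = 0
    for n_in, n_out in zip(layer_sizes, layer_sizes[1:]):
        total += n_in * n_out      # one weight per connection
        if include_bias:
            total += n_out         # one bias per node in the receiving layer
    return total

# 10 loan-application inputs feeding a single output node.
print(count_parameters([10, 1], include_bias=False))  # 10
print(count_parameters([10, 1]))                      # 11

# Adding one hidden layer of 64 nodes already multiplies the count.
print(count_parameters([10, 64, 1]))                  # 769
```

The credit model's count of 10 corresponds to one weight per input; counting a bias term as well would make it 11. Either way, the gap to billions of parameters comes from stacking many wide layers.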

A foundation model is a deep learning model trained on a huge amount of generic data. Once trained, foundation models can be refined for specialized use cases. As the name suggests, these models can form the foundation for many different applications.

Creating a new foundation model today is a substantial project. The process requires enormous amounts of training data, typically collected from scrapes of the internet, digital libraries of books, databases of scholarly articles, stock image collections, or other large data sets. Training a model on this much data takes immense infrastructure, including building or leasing a cloud of GPUs. The largest foundation models to date are reported to have cost hundreds of millions of dollars to build.

Because of the high effort required to train a foundation model from scratch, it’s common to rely on models trained by third parties, then apply customization. There are multiple techniques for customizing a foundation model. These can include fine-tuning, prompt-tuning, and adding customer-specific or domain-specific data. For example, IBM's Granite family of foundation models is trained on curated data, and IBM provides transparency into the data that’s used for training.

Fine-tuning is the process of training a pretrained model further with a more tailored data set so it can effectively perform unique tasks. This additional training data modifies the model’s parameters and creates a new version that replaces the original model.

Fine-tuning typically requires significantly less data and time than the initial training. However, the process of traditional fine-tuning is still compute-intensive.

Parameter-efficient fine-tuning (PEFT) is a set of techniques that adjusts only a portion of the parameters within an LLM to save resources. You can think of it as an evolution of traditional fine-tuning.

LoRA (low-rank adaptation) and QLoRA (quantized low-rank adaptation) are both PEFT techniques for training AI models. LoRA and QLoRA both help fine-tune LLMs more efficiently, but differ in how they manipulate the model and utilize storage to reach intended results.
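A rough sketch of the idea behind LoRA: the original weight matrix stays frozen, and training adjusts only two much smaller matrices whose product is added to it. The function names and the tiny matrices below are hypothetical and for illustration only; QLoRA additionally stores the frozen weights in a quantized, lower-precision format, which is not shown here.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]

def lora_effective_weight(w, b, a, scale=1.0):
    """Frozen weight W plus the low-rank update scale * (B x A)."""
    delta = matmul(b, a)
    return [
        [w_ij + scale * d_ij for w_ij, d_ij in zip(w_row, d_row)]
        for w_row, d_row in zip(w, delta)
    ]

def trainable_params(d_out, d_in, rank):
    """Compare full fine-tuning with LoRA for a single weight matrix."""
    full = d_out * d_in              # every entry of W is trained
    lora = rank * (d_in + d_out)     # only B (d_out x rank) and A (rank x d_in)
    return full, lora

# Tiny worked example: a 2x2 frozen weight with a rank-1 update.
w = [[1.0, 0.0], [0.0, 1.0]]
print(lora_effective_weight(w, b=[[1.0], [2.0]], a=[[3.0, 4.0]]))
# [[4.0, 4.0], [6.0, 9.0]]

# For one hypothetical 4096 x 4096 matrix and rank 8:
print(trainable_params(4096, 4096, 8))  # (16777216, 65536)
```

In that last example the low-rank update trains 256 times fewer values than full fine-tuning of the same matrix, which is where the resource savings come from.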

LoRA vs QLoRA explained

Retrieval-augmented generation (RAG) is a method for getting better answers from a generative AI application by linking an LLM to an external resource.

Implementing RAG architecture into an LLM-based question-answering system (like a chatbot) provides a line of communication between an LLM and your chosen additional knowledge sources. This allows the LLM to cross-reference and supplement its internal knowledge, providing a more reliable and accurate output for the user making a query.
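The retrieve-then-generate flow can be sketched without any real model. In this simplified illustration, retrieval is a word-overlap score (production systems typically use vector search instead), the documents are hypothetical, and the final step only builds the prompt that would be sent to an LLM.

```python
import re

def tokenize(text: str) -> set[str]:
    """Lowercase the text and split it into a set of words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def retrieve(query: str, documents: list[str], top_k: int = 1) -> list[str]:
    """Rank documents by how many words they share with the query."""
    query_words = tokenize(query)
    ranked = sorted(documents, key=lambda d: len(query_words & tokenize(d)), reverse=True)
    return ranked[:top_k]

def build_prompt(query: str, context: list[str]) -> str:
    """Put the retrieved text in front of the question for the LLM to use."""
    joined = "\n".join(f"- {c}" for c in context)
    return f"Answer using only this context:\n{joined}\n\nQuestion: {query}"

# Hypothetical knowledge source (assumption: invented for this example).
documents = [
    "The support desk is open Monday to Friday from 9am to 5pm.",
    "Refunds are processed within 14 days of a return request.",
    "Our headquarters moved to a new building in 2021.",
]

query = "How long do refunds take?"
prompt = build_prompt(query, retrieve(query, documents))
print(prompt)
```

The LLM then answers from the supplied context rather than from its internal knowledge alone, which is what makes the output easier to keep current and to verify.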

Learn more about RAG

As generative AI models become more sophisticated, they grow. Some LLMs can contain hundreds of billions of parameters. Parameters shape an LLM’s understanding of language, and the more parameters a model has, the more complex the tasks it can perform—with greater accuracy. However, more parameters require more processing power.
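One way to see why size matters for hardware is to estimate the memory needed just to hold a model's weights. The sketch below is back-of-the-envelope arithmetic under stated assumptions: it counts weights only (a running model also needs memory for intermediate results), and the 70-billion-parameter figure is a hypothetical example.

```python
def weight_memory_gb(num_parameters: int, bits_per_parameter: int) -> float:
    """Memory for the weights alone, in gigabytes (1 GB = 1e9 bytes)."""
    total_bytes = num_parameters * bits_per_parameter / 8
    return total_bytes / 1e9

params = 70_000_000_000  # hypothetical 70-billion-parameter model

print(weight_memory_gb(params, 16))  # 140.0 GB at 16 bits per parameter
print(weight_memory_gb(params, 4))   # 35.0 GB at 4 bits per parameter
```

At 16 bits per parameter, such a model's weights alone exceed the memory of a single typical GPU, which is why serving efficiency becomes a practical concern.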

Rather than adding more GPUs (which can be costly), you can use tools like vLLM and llm-d to make processing more efficient on your existing hardware.

vLLM is an inference server that speeds up the output of gen AI applications by making better use of GPU memory.

llm-d is a Kubernetes-native, open source framework that speeds up distributed inference at scale.

Both are designed to solve the challenge of serving large generative AI models by focusing on optimizing performance.

Having come a long way in a short time, generative AI technology has attracted more than its share of hype, both positive and negative. The benefits and downsides of this technology are still emerging. Here we provide a brief look at some prominent concerns about generative AI.

Enabling harm. There are immediate and obvious risks of bad actors using generative AI tools for malicious goals, such as large-scale disinformation campaigns on social media, or nonconsensual deepfake images that target real people.

Reinforcing harmful societal bias. Generative AI tools have been shown to regurgitate the human biases that are present in training data, including harmful stereotypes and hate speech.

Supplying wrong information. Generative AI tools can produce made-up and plainly wrong information and scenes, sometimes called “hallucinations.” Some generated content mistakes are harmless, such as a nonsense response to a chat question, or an image of a human hand with too many fingers. But there have been serious cases of AI gone wrong, such as a chatbot that gave harmful advice to people with questions about eating disorders.

Security and legal risks. Generative AI systems can pose security risks, including from users entering sensitive information into apps that were not designed to be secure. Generative AI responses may introduce legal risks by reproducing copyrighted content or appropriating a real person’s voice or identity without their consent. Additionally, some generative AI tools may have usage restrictions.

Unexplainable outputs. Sometimes, an AI model is too complex for a human to understand or interpret. This is called a black-box model. Black-box models can create harmful consequences when used for high-stakes decision making, especially in high-risk industries like healthcare, transportation, security, military, legal, aerospace, criminal justice, or finance. To help solve this, explainable AI (XAI) techniques can be applied throughout the machine learning lifecycle to make outputs more transparent and understandable to humans.

Learn more about explainable AI

Red Hat® AI is our portfolio of AI products built on solutions our customers already trust. This foundation helps our products remain reliable, flexible, and scalable.

Red Hat AI can help organizations:

Adopt and innovate with AI quickly.
Break down the complexities of delivering AI solutions.
Deploy anywhere.

Explore Red Hat AI

Red Hat AI provides access to a repository of third-party models that are validated to run efficiently across our platform. This set of ready-to-use models is run through capacity guidance planning scenarios, helping you make informed decisions for your domain-specific use cases.

Learn more about validated models by Red Hat AI

A foundation to keep your options open

Red Hat AI solutions support both generative and predictive AI capabilities. With bring-your-own-model flexibility, there is support for training and fine-tuning foundation models specific to your business use case.

A good place to start is Red Hat® Enterprise Linux® AI, a platform for running LLMs in individual server environments. The solution includes Red Hat AI Inference Server, delivering fast, cost-effective inference across the hybrid cloud by maximizing throughput and minimizing latency. The AI platform gives developers quick access to a single server environment, complete with LLMs and AI tooling. It provides everything needed to tune models and build gen AI applications.

Explore Red Hat Enterprise Linux AI

Additionally, our AI partner ecosystem is growing. A variety of technology partners are working with Red Hat to certify operability with Red Hat AI. This way, you can keep your options open.

Learn more about our partners

Keep reading

What is explainable AI?

Explainable AI (XAI) techniques, applied during the machine learning (ML) lifecycle, make AI outputs more understandable and transparent to humans.

Agentic AI vs. generative AI

Agentic AI and generative AI explained: Learn how each works, their unique strengths, and how they can collaborate for smarter solutions.

How vLLM accelerates AI inference: 3 enterprise use cases

This article highlights 3 real-world examples of how well-known companies are successfully using vLLM.
