Building a Foundation for AI Models

Technically Speaking with Chris Wright
00:01 — Chris Wright

No matter how prevalent AI may seem, we're at a point in time where creating an AI model takes a significant amount of data and effort. It takes a big investment of time and money to produce narrowly focused models that can't easily be repurposed. Can we reuse these models? How can we take the reusability and extensibility we've learned in software and bring it to AI?


00:33 — Chris Wright

Foundation models are machine learning models trained on large data sets that can then be applied to a variety of tasks. And to talk about where foundation models may be taking us, I'm being joined by Sriram Raghavan from IBM Research. Hey, Sriram, how you doing?

00:49 — Sriram Raghavan

Hey Chris, how are you?

00:51 — Chris Wright

I am pumped. I'm so excited to talk about foundation models with you. And well, let me just start off with this notion: we use a lot of data to create and deploy really powerful models. It's a lot of work. It's difficult to then take that model and apply it to other domains. So how can we take more of a standard model and retrain it with smaller amounts of data to make this broad impact?

01:20 — Sriram Raghavan

Awesome. Great question, Chris, and that's exactly why I think foundation models are so exciting. If you think about what we've tried to do in AI, a good lens is to look at data representations. When we first tried to do AI in the 1950s, when the term was coined, the representation of data was logic: symbols, rules, and facts, and we'd do inferencing and reasoning on top. Okay. That was very hard to do, because who was gonna write all of them down? It didn't go anywhere. Then we said, you know what? There's data, I have big data. Let's apply machine learning. At that time though, before you could apply an interesting machine learning technique, you still had to do a lot of manual labor to get data into the right representation. And that's where foundation models are very exciting, 'cause what they're doing is enabling you to create a powerful and flexible data representation. Creating that might still require fairly large amounts of data and some compute power. But once I do that, the amount of data, compute, and effort needed for the individual AI tasks that I wanna do is much, much less. And that's what's very, very exciting: it's almost a paradigm shift in the way AI gets done.

02:28 — Chris Wright

And I know in the AI space we think about how you build a model, you train a model, you deploy that model. How do you relate models in the foundation model sense to what you actually deploy after you've done this, I'll call it, simplified training process?

02:43 — Sriram Raghavan

Now, you actually don't deploy the actual foundation model. You take that foundation model, so the data representation, and using a small amount of task-specific data, you then actually create an eventual model that will be deployed into your application or use case. Let's take NLP, natural language processing, which is where the technique and philosophy behind foundation models was born. It's more broadly applicable now, but let's go there because it's useful to understand. So if I wanted to build a summarization model, I would give you maybe a thousand summaries, if I was generous. In many use cases you don't even have that much. Just with that, you're going to have to build a model, a summarization model, that you would deploy. But think about it. What does that model have to do? It has to not only learn what it means to summarize. It has to pretty much learn what English is. And all it has is that amount of data. So the idea with foundation models is: let's create a data representation, in this case a powerful representation for, let's say, English, and then using just the thousand examples I have for summarization, I will fine-tune it, adapt it, to actually create a summarization model, which is what will eventually get deployed. That's the real model in the traditional sense that goes inside of an application. It could go inside an email system, help you summarize emails and whatever you want. But the foundation model is still there, ready. You can now give it some other examples. I will fine-tune it, create another model, and deploy it in an application.
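[Editor's illustration] The pattern described here, building a shared representation once and then adapting it to each task with a handful of examples, can be sketched in a few lines of Python. This is a toy under loud assumptions: the "representation" is a simple word-weight table rather than a neural network, the tasks are classification rather than summarization, and every name, ticket text, and label below is hypothetical.

```python
import math
from collections import Counter


def build_representation(unlabeled_corpus):
    """Expensive step, done once: learn word weights from raw, unlabeled text."""
    doc_freq = Counter()
    for doc in unlabeled_corpus:
        doc_freq.update(set(doc.lower().split()))
    n_docs = len(unlabeled_corpus)
    # Smoothed inverse document frequency: rarer words get larger weights.
    return {w: math.log((1 + n_docs) / (1 + df)) + 1.0 for w, df in doc_freq.items()}


def embed(representation, text):
    """Map text to a sparse vector using only words the representation knows."""
    counts = Counter(w for w in text.lower().split() if w in representation)
    return {w: c * representation[w] for w, c in counts.items()}


def cosine(a, b):
    dot = sum(v * b.get(w, 0.0) for w, v in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def adapt(representation, labeled_examples):
    """Cheap step, done per task: one centroid per label from a few examples."""
    centroids = {}
    for text, label in labeled_examples:
        centroid = centroids.setdefault(label, {})
        for word, weight in embed(representation, text).items():
            centroid[word] = centroid.get(word, 0.0) + weight

    def task_model(text):
        vector = embed(representation, text)
        return max(centroids, key=lambda label: cosine(vector, centroids[label]))

    return task_model


# Hypothetical unlabeled data: gathered and processed once.
corpus = [
    "the server restarted after the disk filled up",
    "please reset my password for the billing portal",
    "the invoice total for this month looks wrong",
    "the database server ran out of memory overnight",
    "i was charged twice on my invoice",
    "the disk on the build server is failing",
]
rep = build_representation(corpus)

# Two different deployable task models, each adapted from two labeled examples.
route_ticket = adapt(rep, [
    ("server disk failure", "infrastructure"),
    ("invoice charged wrong", "billing"),
])
detect_urgency = adapt(rep, [
    ("server ran out of memory overnight", "urgent"),
    ("please reset my password", "routine"),
])

print(route_ticket("the invoice looks wrong"))                  # billing
print(detect_urgency("the database server ran out of memory"))  # urgent
```

Note that `build_representation` never sees a label, and that `route_ticket` and `detect_urgency` are separate small models that share it. Only those small models would go into an application, which mirrors the point that the foundation model itself is not what gets deployed.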

04:11 — Chris Wright

So that really shows the ability to take a foundation model and retrain it, or train it, with a smaller amount of data to make a task-specific model. It sounds like you're building tools that aren't just useful for the big-scale providers of AI but kind of for everybody. How do you see that working?

04:30 — Sriram Raghavan

So eventually, when you talk about AI in an enterprise, in the context of a client, a retail client, a manufacturing client, a banking client, we'll build a foundation model on your customer data once, getting the data together and doing that hard work once. But now those 50 models, and maybe the next 50 models that you want to build, are going to go much, much faster for you. And that's very, very interesting for clients, because that's what's slowing down AI a little bit, actually quite a bit for some clients. So the opportunity to build foundation models on their data accelerates their AI and lets them do the hard work once and capitalize on it over and over again.

05:08 — Chris Wright

That really demonstrates how the representation of knowledge, and the ability to take that representation and fine-tune it for specific tasks, radically accelerates how you get to that end state, which is businesses wanting to be data driven: taking the data that they have and using it to make smarter decisions, faster. You mentioned Industry 4.0; I'm interested in AIOps. I think there's gotta be some great reuse of the technology here in these new spaces.

05:38 — Sriram Raghavan

Oh, absolutely. When you think about what it takes to add value in AIOps, it's not just the instrumentation and metrics from the IT system. There are logs, there's unstructured data. There are maybe past reports from SREs on what troubleshooting activity they took. That's unstructured data. You wanna bring it all together and then provide AI that lets you bring more automation and more intelligence to how you run an IT system. So the ability to bring in multimodal data, create that representation, and then use it very, very quickly is going to open huge opportunities in many, many use cases. But the point is, in all of these cases, there are multiple types of data describing the behavior of the system, and multiple types of data describing what human beings have done in the past to fix the system. You want to put all that knowledge together in a representation and use it. And that's what the foundation model representation lets you do.
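[Editor's illustration] As a rough sketch of "bringing it all together", the Python below folds metrics and log text into one feature vector per incident and reuses past SRE notes by looking up the most similar earlier incident. It is an assumption-heavy toy: the `Incident` fields, the hand-built features, and the sample data are all hypothetical, and a foundation model would learn its representation from data rather than rely on hand-written features like these.

```python
import math
from dataclasses import dataclass


@dataclass
class Incident:
    metrics: dict       # hypothetical readings scaled to 0..1, e.g. {"cpu": 0.95}
    log_lines: list     # raw, unstructured log text
    sre_note: str = ""  # what a human did to fix it; empty for a new incident


def to_features(incident):
    """Fold structured metrics and unstructured logs into one sparse vector."""
    features = {f"metric:{name}": float(value) for name, value in incident.metrics.items()}
    for line in incident.log_lines:
        for token in line.lower().split():
            key = f"log:{token}"
            features[key] = features.get(key, 0.0) + 1.0
    return features


def similarity(a, b):
    """Cosine similarity between two sparse feature vectors."""
    dot = sum(v * b.get(k, 0.0) for k, v in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def suggest_fix(new_incident, history):
    """Return the SRE note from the most similar past incident."""
    new_vector = to_features(new_incident)
    best = max(history, key=lambda past: similarity(new_vector, to_features(past)))
    return best.sre_note


# Hypothetical history combining metrics, logs, and human troubleshooting notes.
history = [
    Incident({"cpu": 0.95, "mem": 0.30}, ["worker timeout waiting for cpu"],
             "scaled out the worker pool"),
    Incident({"cpu": 0.20, "mem": 0.97}, ["oom killer terminated process"],
             "raised the memory limit"),
]
new_incident = Incident({"cpu": 0.25, "mem": 0.93}, ["process terminated by oom killer"])

print(suggest_fix(new_incident, history))  # raised the memory limit
```

The design choice to show is that both kinds of data end up in the same vector space, so a single similarity measure can draw on metrics and logs together, while the human-written notes become the knowledge that gets reused.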

06:30 — Chris Wright

Sriram, this has been great. It's such a complex topic, and it's impossible really to do it justice in just a short amount of time, but you've really helped us better understand the underlying principles of foundation models and that representation of knowledge. So thank you so much, I really enjoyed this.

06:48 — Sriram Raghavan

Likewise, Chris, thank you.

06:50 — Chris Wright

As we continue to refine and improve our foundation models, what happens next will matter for everybody, not just for the people who deliver AI. Our goal is to make AI commonplace, unremarkable, essentially boring. With widespread applicability and easy-to-use tools, everyone can embrace AI and harness its potential to address the world's most difficult challenges.


Keywords: AI/ML

Meet the guest

Sriram Raghavan

Vice President
IBM Research AI

Presented by Red Hat
