Building a foundation for AI models

April 27, 2022 | Technically Speaking Team AI/ML Partners

Mo data, mo problems? Training AI models requires a significant up-front investment in time and resources. Being able to repurpose models for different domains or use cases would make AI technology more accessible to everyone, but repurposing models can be tricky. That's why AI foundation models represent such a paradigm shift. Join Red Hat CTO Chris Wright and IBM Research AI VP Sriram Raghavan to explore how foundation models change the game for training AI/ML.

Transcript

00:01 - Chris Wright
No matter how prevalent AI may seem, we're at a point in time where creating an AI model takes a significant amount of data and effort. Every model requires a big investment of time and money to produce narrowly focused models that can't easily be repurposed. Can we reuse these models? How can we take the reusability and extensibility we've learned in software and bring it to AI?

00:24 - INTRO ANIMATION

00:33 - Chris Wright
Foundation models are machine learning models trained on large data sets that can then be applied to a variety of tasks. And to talk about where foundation models may be taking us, I'm being joined by Sriram Raghavan from IBM Research. Hey, Sriram, how you doing?

00:49 - Sriram Raghavan
Hey Chris, how are you?

00:51 - Chris Wright
I am pumped. I'm so excited to talk about foundation models with you. And well, let me just start off with this notion: we use a lot of data to create and deploy really powerful models. It's a lot of work. It's difficult to then take that model and apply it to other domains. So how can we take more of a standard model and retrain it with smaller amounts of data to make this broad impact?

01:20 - Sriram Raghavan
Awesome. Great question, Chris, and that's exactly why I think foundation models are so exciting. If you think about what we've tried to do in AI, a good lens is to look at data representations. When we first tried to do AI in the 1950s, when the term was coined, the representation of data was logic, symbols, rules, and facts, and we'd do inference and reasoning on top. Okay. That was very hard to do, because who was gonna write all of them down? It didn't go anywhere. Then we said, you know what? There's data, I have big data. Let's apply machine learning. At that time though, before you could apply an interesting machine learning technique, you still had to do a lot of manual labor to get data into the right representation. And that's where foundation models are very exciting, 'cause what they're doing is enabling you to create a powerful and flexible data representation. Creating that might still require fairly large amounts of data and some compute power. But once I do that, the amount of data, compute, and effort needed for the individual AI tasks that I wanna do is much, much less. And that's what's very, very exciting: it's almost a paradigm shift in the way AI gets done.

02:28 - Chris Wright
And I know in the AI space we think about it as: you build a model, you train a model, you deploy that model. How do you relate models in the foundation model sense to what you actually deploy after you've done this, I'll call it, simplified training process?

02:43 - Sriram Raghavan
Now, you actually don't deploy the actual foundation model. You take that foundation model, so the data representation, and using a small amount of task-specific data, you then create the eventual model that will be deployed into your application or use case. Let's take NLP, natural language processing, which is where the technique and philosophy behind foundation models was born. It's more broadly applicable now, but let's go there because it's useful to understand. So if I wanted to build a summarization model, I would give you maybe a thousand summaries, if I was generous. In many use cases you don't even have that much. Just with that, you're going to have to build a model, a summarization model, that you would deploy. But think about it. What does that model have to do? It has to not only learn what it means to summarize. It has to pretty much learn what English is. And all it has is that amount of data. So the idea with foundation models is: let's create a data representation, in this case a powerful representation for, let's say, English, and then using just the thousand examples I have for summarization, I will fine-tune and adapt it to create a summarization model, which is what will eventually get deployed. That's the real model in the traditional sense that goes inside of an application. It could go inside an email system and help you summarize emails and whatever you want, but the foundation model is now ready. You can now give it some other examples; I will fine-tune it, create another model, and deploy it in an application.
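Editor's sketch: the "build the representation once, then adapt it with a little task data" pattern described above can be illustrated with a toy example. This is not how a real foundation model works: here the shared representation is a simple TF-IDF-style encoder learned from unlabeled text, and each "task model" is a nearest-centroid classifier, whereas real foundation models are large neural networks whose weights are fine-tuned. All names (`build_representation`, `train_task_head`, the sample texts and labels) are hypothetical and chosen only for illustration.

```python
import math
from collections import Counter, defaultdict

def build_representation(unlabeled_texts):
    """Done once: learn per-word IDF weights from a pool of unlabeled text."""
    doc_freq = Counter()
    for text in unlabeled_texts:
        doc_freq.update(set(text.lower().split()))
    n_docs = len(unlabeled_texts)
    return {w: math.log((1 + n_docs) / (1 + df)) + 1.0 for w, df in doc_freq.items()}

def encode(idf, text):
    """Map text to a sparse, L2-normalized vector using the frozen weights."""
    counts = Counter(w for w in text.lower().split() if w in idf)
    vec = {w: c * idf[w] for w, c in counts.items()}
    norm = math.sqrt(sum(v * v for v in vec.values())) or 1.0
    return {w: v / norm for w, v in vec.items()}

def train_task_head(idf, labeled_examples):
    """Done per task: average the encoded examples of each label into a centroid."""
    sums = defaultdict(lambda: defaultdict(float))
    counts = Counter()
    for text, label in labeled_examples:
        for w, v in encode(idf, text).items():
            sums[label][w] += v
        counts[label] += 1
    return {lab: {w: v / counts[lab] for w, v in vec.items()} for lab, vec in sums.items()}

def predict(idf, head, text):
    """Return the label whose centroid has the largest dot product with the text."""
    vec = encode(idf, text)
    return max(head, key=lambda lab: sum(v * head[lab].get(w, 0.0) for w, v in vec.items()))

# The shared representation is built once from unlabeled text (toy data).
corpus = [
    "the server restarted after the disk filled up",
    "please send the invoice before friday",
    "the meeting is moved to monday morning",
    "the database server is slow and the disk is full",
    "thanks for the quick reply about the invoice",
    "can we reschedule the meeting to next week",
]
IDF = build_representation(corpus)

# Two different tasks reuse it, each with only a couple of labeled examples.
topic_head = train_task_head(IDF, [
    ("the disk on the server is full", "ops"),
    ("send the invoice today", "billing"),
])
urgency_head = train_task_head(IDF, [
    ("the server is slow and the disk is full", "urgent"),
    ("thanks for the quick reply", "routine"),
])

print(predict(IDF, topic_head, "please send the invoice"))        # billing
print(predict(IDF, urgency_head, "database server is slow"))      # urgent
```

The point of the sketch is the division of labor: `build_representation` is the expensive, shared step, while each call to `train_task_head` is cheap and produces the model that would actually be deployed.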

04:11 - Chris Wright
So that really shows the ability to take a foundation model and train it with a smaller amount of data to make a task-specific model. It sounds like you're building tools that aren't just useful for the big-scale providers of AI but kind of for everybody. How do you see that working?

04:30 - Sriram Raghavan
So eventually, when you talk about AI in an enterprise, in the context of a client, a retail client, a manufacturing client, a banking client, we'll build a foundation model on your customer data once, getting the data together and doing that hard work once. But now those 50 models, and maybe the next 50 models that you want to build, are going to go much, much faster for you. And that's very, very interesting for clients, because that's what's slowing down AI a little bit, actually quite a bit for some clients. So the opportunity to build foundation models on their data accelerates their AI and lets them do the hard work once and capitalize on it over and over again.

05:08 - Chris Wright
That really demonstrates how the representation of knowledge, and the ability to take that representation and fine-tune it for specific tasks, radically accelerates how you get to that end state, which is businesses wanting to be data driven: taking the data that they have and using it to make smarter decisions, and smarter decisions faster. You mentioned Industry 4.0; I'm interested in AIOps. I think there's gotta be some great reuse of the technology here in these new spaces.

05:38 - Sriram Raghavan
Oh, absolutely. When you think about what it takes to add value to AIOps, it's not just the instrumentation and metrics from the IT system. There are logs, there's unstructured data. There may be past reports from SREs on what troubleshooting activity they took. That's unstructured data. You wanna bring it all together and then provide AI that lets you do more automation and bring more intelligence to how you run an IT system. So the ability to bring multimodal data together, create that representation, and then use it very, very quickly is going to open up huge opportunities in many, many use cases. But the point is, in all of these cases, there are multiple types of data describing the behavior of the system, and multiple types of data describing what human beings have done in the past to fix the system. You want to put all the knowledge together in a representation and use it. And that's what the foundation model representation lets you do.
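Editor's sketch: one way to picture "putting all the knowledge together in a representation" for AIOps is to combine metrics, log text, and SRE notes into a single vector and look up the most similar past incident. The version below is a hand-built stand-in, under loud assumptions: a foundation model would learn the representation from data, while this sketch simply concatenates normalized per-modality features. The `Incident` fields, the `TEXT_TERMS` vocabulary, and the sample incidents are all hypothetical.

```python
import math
from dataclasses import dataclass

# Hypothetical vocabulary used for both log text and SRE notes.
TEXT_TERMS = ["timeout", "oom", "disk", "refused", "memory", "deploy"]

@dataclass
class Incident:
    cpu: float         # CPU utilization in [0, 1]
    error_rate: float  # fraction of failed requests in [0, 1]
    logs: str          # raw log text
    notes: str         # free-text SRE report
    fix: str           # action that resolved the incident

def term_counts(text):
    """Count occurrences of each vocabulary term in whitespace-split text."""
    words = text.lower().split()
    return [float(words.count(term)) for term in TEXT_TERMS]

def unit(vec):
    """Scale a vector to unit length so no single modality dominates."""
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm else vec

def represent(cpu, error_rate, logs, notes):
    """Concatenate metrics, log, and note features into one vector."""
    return unit([cpu, error_rate]) + unit(term_counts(logs)) + unit(term_counts(notes))

def most_similar(history, cpu, error_rate, logs, notes):
    """Return the past incident whose combined vector best matches the query."""
    query = represent(cpu, error_rate, logs, notes)
    def similarity(incident):
        past = represent(incident.cpu, incident.error_rate, incident.logs, incident.notes)
        return sum(a * b for a, b in zip(query, past))
    return max(history, key=similarity)

history = [
    Incident(0.95, 0.10, "oom killed worker oom", "memory pressure on workers", "scale out workers"),
    Incident(0.30, 0.60, "connection refused timeout timeout", "upstream unreachable after deploy", "rollback deploy"),
    Incident(0.40, 0.05, "disk full disk warning", "log volume filled", "cleanup old logs"),
]

match = most_similar(history, 0.90, 0.15, "worker oom again", "memory usage climbing")
print(match.fix)
```

Because every modality lands in the same vector, a new incident can match a past one on metrics, logs, and human notes at once, which is the reuse the passage above describes.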

06:30 - Chris Wright
Sriram, this has been great. It's such a complex topic and it's impossible really to do it justice in just a short amount of time but you've really helped us understand better the underlying principles of foundation models and that representation of knowledge. So thank you so much, I really enjoyed this.

06:48 - Sriram Raghavan
Likewise, Chris, thank you.

06:50 - Chris Wright
As we continue to refine and improve our foundation models, what happens next will matter for everybody, not just for the people who deliver AI. Our goal is to make AI commonplace, unremarkable, essentially boring. With widespread applicability and easy-to-use tools, everyone can embrace AI and harness its potential to address the world's most difficult challenges.

07:14 - OUTRO ANIMATION

More about AI/ML

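Artificial intelligence typically refers to processes and algorithms that can use data to simulate human intelligence; AI can provide insights based on the knowledge it has acquired.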

Blog post

Think big, start small: Why focused AI will win in 2025

January 11, 2026

AI/ML

Blog post

Recommendations on prompt patterns for generative AI large language models (LLMs)

January 11, 2026

AI/ML

About the show

Technically Speaking

What's next for enterprise IT? No one has all the answers, but CTO Chris Wright knows the tech experts and industry leaders who are working on them.