Building a Foundation for AI Models
No matter how prevalent AI may seem, we're at a point in time where creating an AI model takes a significant amount of data and effort. Each model demands a big investment of time and money, and the result is usually a narrowly focused model that can't easily be repurposed. Can we reuse these models? How can we take the reusability and extensibility we've learned in software and bring them to AI?
00:24 — INTRO ANIMATION
Foundation models are machine learning models trained on large data sets that can then be applied to a variety of tasks. And to talk about where foundation models may be taking us, I'm being joined by Sriram Raghavan from IBM Research. Hey, Sriram, how you doing?
Hey Chris, how are you?
I am pumped. I'm so excited to talk about foundation models with you. Let me start with this notion: we use a lot of data to create and deploy really powerful models. It's a lot of work, and it's difficult to then take that model and apply it to other domains. So how can we take a more standard model and retrain it with smaller amounts of data to make this broad impact?
Awesome. Great question, Chris, and that's exactly why I think foundation models are so exciting. If you think about what we've tried to do in AI, a good lens is to look at data representations. When the term was coined in the 1950s, the representation of data was logic: symbols, rules, and facts, with inference and reasoning on top. That was very hard to do, because who was going to write all of that down? It didn't go anywhere. Then we said, you know what, there's data, I have big data, let's apply machine learning. At that time, though, before you could apply an interesting machine learning technique, you still had to do a lot of manual labor to get data into the right representation.

And that's where foundation models are very exciting, because they enable you to create a powerful and flexible data representation. Creating that representation might still require fairly large amounts of data and some compute power, but once I've done it, the amount of data, compute, and effort needed for each individual AI task I want to do is much, much less. That's what's very, very exciting: it's almost a paradigm shift in the way AI gets done.
And I know in the AI space we think about building a model, training a model, and deploying that model. How do models in the foundation model sense relate to what you actually deploy after you've done this, let's call it, simplified training process?
Now, you actually don't deploy the foundation model itself. You take that foundation model, that data representation, and using a small amount of task-specific data, you create the eventual model that will be deployed into your application or use case.

Let's take NLP, natural language processing, which is where the technique and philosophy behind foundation models was born. It's more broadly applicable now, but let's go there because it's useful for understanding. If I wanted to build a summarization model, you might give me a thousand summaries, if I was generous; in many use cases you don't even have that much. With just that data, you'd have to build the summarization model you would deploy. But think about what that model has to do. It has to learn not only what it means to summarize, it pretty much has to learn what English is, and all it has is that small amount of data.

The idea with foundation models is: let's create a powerful data representation, in this case for English, and then, using just the thousand summarization examples I have, fine-tune and adapt it to create a summarization model, which is what eventually gets deployed. That's the real model in the traditional sense that goes inside an application; it could go inside an email system and help you summarize emails, whatever you want. But the foundation model is now ready for more: give it examples for some other task, I'll fine-tune it, create another model, and deploy that in another application.
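That adapt-and-deploy pattern can be sketched in a few lines of Python. In this toy sketch, a frozen bag-of-words embedding stands in for the large pretrained representation, and each task trains only a small linear head on a handful of labeled examples. Every name, vocabulary word, and number below is illustrative, not from the interview.

```python
import math

# Toy stand-in for a large pretrained representation: a frozen
# bag-of-words embedding over a tiny fixed vocabulary.
VOCAB = ["good", "great", "bad", "awful"]

def foundation_embed(text):
    """The 'foundation model': built once, never retrained per task."""
    words = text.lower().split()
    return [float(words.count(w)) for w in VOCAB]

def train_task_head(examples, epochs=200, lr=0.5):
    """Fine-tuning step: fit only a small linear head (logistic
    regression via SGD) on a handful of task-specific examples."""
    w = [0.0] * len(VOCAB)
    b = 0.0
    for _ in range(epochs):
        for text, label in examples:
            x = foundation_embed(text)          # representation stays frozen
            z = sum(wi * xi for wi, xi in zip(w, x)) + b
            p = 1.0 / (1.0 + math.exp(-z))      # sigmoid
            g = p - label                       # logistic-loss gradient
            w = [wi - lr * g * xi for wi, xi in zip(w, x)]
            b -= lr * g
    return w, b

def predict(head, text):
    """The deployed task model: frozen embedding plus trained head."""
    w, b = head
    x = foundation_embed(text)
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else 0

# "A thousand examples, if I was generous" -- here, just four,
# for a stand-in classification task:
task_data = [("good great", 1), ("great good great", 1),
             ("bad awful", 0), ("awful bad bad", 0)]
head = train_task_head(task_data)   # cheap: only the head is trained
```

The same `foundation_embed` could then be reused with a different handful of examples to train a head for the next task; only the small head, never the expensive representation, is rebuilt each time.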
So that really shows how the ability to take a foundation model and train it with a smaller amount of data to make a task-specific model means you're building tools that aren't just useful for the big-scale providers of AI, but for everybody. How do you see that working?
So eventually, when you talk about AI in an enterprise, in the context of a retail client, a manufacturing client, a banking client, you build a foundation model on your data once: you get the data together and do that hard work one time. But now the 50 models you want to build, and maybe the next 50 after that, are going to come much, much faster. That's very, very interesting for clients, because that up-front work is what's slowing down AI a little bit, actually quite a bit, for some of them. The opportunity to build foundation models on their data accelerates their AI: you do the hard work once and capitalize on it over and over again.
That really demonstrates how the representation of knowledge, and the ability to fine-tune that representation for specific tasks, radically accelerates getting to that end state, where businesses that want to be data-driven can use the data they have to make smarter decisions, and make them faster. You mentioned Industry 4.0; I'm interested in AIOps. I think there's got to be some great reuse of this technology in these new spaces.
Oh, absolutely. When you think about what it takes to add value in AIOps, it's not just the instrumentation and metrics from the IT system. There are logs, there's unstructured data, there are maybe past reports from SREs on what troubleshooting steps they took. That's unstructured data. You want to bring it all together and then provide AI that enables more automation and more intelligence in how you run an IT system. So the ability to bring in multimodal data, create that representation, and then use it very, very quickly is going to open up huge opportunities in many, many use cases. The point is that in all of these cases there are multiple types of data describing the behavior of the system, and multiple types of data describing what human beings have done in the past to fix it. You want to put all of that knowledge together in one representation and use it, and that's what the foundation model representation lets you do.
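As a minimal sketch of that "bring multimodal data into one representation" idea, the snippet below concatenates a normalized metrics vector with crude keyword-count embeddings of a log line and a past incident note into a single fixed-length vector. The embedding functions, keyword list, and field names are all illustrative assumptions, not a real AIOps pipeline; in practice a foundation model would learn a far richer joint representation.

```python
def embed_metrics(cpu, mem, latency_ms):
    # Normalize raw telemetry into a small numeric vector.
    return [cpu / 100.0, mem / 100.0, latency_ms / 1000.0]

def embed_text(text, keywords=("error", "timeout", "oom", "restart")):
    # Crude stand-in for a language-model embedding of logs or SRE notes.
    words = text.lower().split()
    return [float(words.count(k)) for k in keywords]

def unified_representation(metrics, log_line, incident_note):
    # One joint vector combining all three modalities.
    return (embed_metrics(*metrics)
            + embed_text(log_line)
            + embed_text(incident_note))

rep = unified_representation(
    metrics=(85, 70, 420),                                  # cpu%, mem%, ms
    log_line="ERROR timeout contacting payment service",
    incident_note="restart fixed a similar timeout last week",
)
# rep is one fixed-length vector: 3 metric features + 4 + 4 text features
```

Concatenation is the simplest possible fusion; the point it illustrates is that system behavior, logs, and human troubleshooting history all end up in one representation that downstream models can consume.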
Sriram, this has been great. It's such a complex topic, and it's impossible to really do it justice in a short amount of time, but you've helped us better understand the underlying principles of foundation models and that representation of knowledge. So thank you so much; I really enjoyed this.
Likewise, Chris, thank you.
As we continue to refine and improve our foundation models, what happens next will matter for everybody, not just for the people who deliver AI. Our goal is to make AI commonplace, unremarkable, essentially boring. With widespread applicability and easy-to-use tools, everyone can embrace AI and harness its potential to address the world's most difficult challenges.
07:14 — OUTRO ANIMATION
Meet the guest
Sriram Raghavan, Vice President, IBM Research AI, IBM
Presented by Red Hat
For 25 years, Red Hat has been bringing open source technologies to the enterprise. From the operating system to containers, we believe in building better technology together, and celebrating the unsung heroes who are remaking our world from the command line up.