Defining sovereign AI with open source ft. Jered Floyd
What happens to your business operations when your AI workflows run on a proprietary black box? Red Hat CTO Chris Wright and Jered Floyd break down the 4 pillars of sovereign AI, challenge common misconceptions about data ownership, and outline a clear path to platform autonomy.
Transcript
00:00 - Chris Wright
In the tech industry, we love a good catchall term. For years, it was digital transformation. But if you've been at any of the big tech conferences lately, you know the focus has shifted. Today, it's all about sovereignty. To some, it sounds like business jargon or just a policy check-box. But for the architect, sovereignty is about agency. So, joining me to cut through the noise is Jered Floyd from our field CTO team. Welcome to "Technically Speaking", where we explore how open source is shaping the future of technology. I'm your host, Chris Wright. Jered, great to have you here.
00:38 - Jered Floyd
Thank you.
00:39 - Chris Wright
So, imitation is the highest form of flattery. I stole from you this reframing of digital transformation into sovereignty. Tell me a little bit about why you frame it that way. What's important about that reframing, and more specifically, what's happened in the last 12, 18 months that's really brought this conversation to the foreground?
01:01 - Jered Floyd
Yeah, for sure. So, the reason I've taken that framing is because digital transformation, something we've talked about for years, it's gone through a peak and then we've moved on to newer topics, is a way of bringing together a whole bunch of technology shifts that have happened across computing. So, we talked about things like containerization, about microservices, about DevOps, all things that we just do today. But they really held together to drive a fundamental shift in how applications were built. And so, I think digital sovereignty looks the same way because it's about fundamental changes in how we manage data, how we manage autonomy around ownership of that data, operation of the software, development of the software, and then meeting regulatory rules and feeling assured about the integrity of our systems. Like, you saw some fines around things like laws like GDPR, but this is not so much about, "Oh, we're worried we're going to get fined." This is really about risks to just overall operation to the business. But also, there's a lot of understanding now of, "Oh, we had risks we didn't know before about our ability to operate, but also our ability to innovate." And so, by applying these technologies, these companies can move more quickly by bringing more know-how in-house, by bringing more operational assurance in-house and build new things faster. So, it's kind of a win-win. You get to mitigate some risk, but you also get to do more for your business.
02:31 - Chris Wright
Yeah, that's a nice combination of new capabilities and moving quickly with the rapidly changing demands while at the same time de-risking your business. Usually, it's trade-offs, so this is an interesting way to take a position forward with agency, ownership, or control over your software stack and operations and all the things we'll get into, that also brings you capabilities and speed, which is an interesting shift. And I know there's a whole set of conversations around what exactly makes up sovereignty. Before we jump into that, I'll hear terms, sovereign cloud, sovereign AI. Are they the same? Is one a superset of the other? How do you approach the conversation between different labels with sovereignty?
03:24 - Jered Floyd
Yeah, so we have four pillars that we categorize sovereign operations into, or sovereign capabilities into. But for sovereign cloud versus sovereign AI, I mean, you can take any sort of technology space. Cloud, cloud is being able to run your capabilities outside of your data center. Sovereign means that you have levels of control over that. The same thing with AI. Are you able to run your AI technologies in your data center, or at least in an AI data center that is in the same region, the same country, under the same control domain? And so, the way we've been thinking about sovereignty overall is in four pillars. So, data sovereignty, operational sovereignty, technology sovereignty, and assurance sovereignty. If you look at data sovereignty, that has to do with, where is your data being stored and processed? Am I shipping it to another country which may have different laws where it may be subject to stronger or weaker laws? That's really important, especially because the data is the lifeblood of any modern business. So, having any one risk associated with that, huge challenge. Also, having that leak, having that be stolen, huge, huge, huge challenge. The second area, operational sovereignty, is who has control over your systems? If they're running in your own data center, you at least have physical control over those systems. You may have outsourced the operations to a third party. So, again, you have to be concerned about where that third party is, what access, what control they have to both the technology as well as, going back to the first one, the data. Then, I'd say technology sovereignty is, who has ownership over the technology that you're using? And this is an area where open source is absolutely critical. Because with technology, okay, so if I'm using a closed-source product, I have no ability to know where it's being built, make modifications to it, even audit that it's doing what I think it's supposed to be doing. 
With open source, almost all open source is global. The Linux kernel is developed in probably close to every country. I don't know if every continent. It'd be interesting if there are any Arctic or Antarctic software developers working on the Linux kernel.
05:41 - Chris Wright
The penguins, for sure.
05:43 - Jered Floyd
Yeah, definitely! So, with technology sovereignty, then you have the ability with open source to know that you have the ability to make changes at any time. You have the ability to continue operating even if your provider disappears. So, if you can't have a particular business provider anymore, you still have the software. And that's super critical to having that overall sovereignty, as well as assurance. And so, getting to assurance, assurance is, how do I know that I can meet these regulations? How do I know that the software's doing what it says? How do I know that the people who have access to my software actually have the access that they're claiming they have? And again, that's an area where you can do audits, where you can do audits against both the code itself, which is very authoritative, or your processes, slightly less authoritative, but at least you can test them. And then, overall assurance about, some of these regulations, for example, require that you actually test your ability to leave a service provider. I can't quite imagine, I only have about a dozen VMs in a major cloud provider, and I can't even imagine how long it would take for me to move those. Large financial institutions have to show and prove that they can do that right now.
07:04 - Chris Wright
Yeah, part of DORA has this. How quickly can you repatriate a workload from potentially failed infrastructure to another infrastructure? And it's essentially instantaneous in terms of what the expectations are. And so, certainly, automation becomes really important, but I think that layout of operational, technical, and these are aspects that I think don't always come to the foreground. Data, I think, is a pretty natural understanding, but there's operational data as well as the customer data. And that, I think, gets a little lost in this conversation. I'm glad you bring it to the foreground. I was speaking with a customer who was talking about a sovereign zone from a global hyperscaler. This sovereign zone required authentication into the physical access, into the colocation area with somebody in the US. Like, "Oh, I think there's some operational concerns there," or running this infrastructure in a sovereign area, but taking the operational data into a different geography. These are the interesting corner cases that I don't think always surface. But I really like this view of open source as a tool in the toolbox of sovereignty. And of course, who doesn't love open source, right? We're here talking about open source all the time, but used in this context. And the interesting twist of the software license for open source comes from the project itself. And you have licensed agency over the software as you use open source software, quite independent from the business relationship you have from a provider. I think that's a really unique role that open source plays in this entire stack. I often think of, as we shift into the AI world where AI and data are talked about hand-in-glove with one another, and data being such a typical part of the sovereignty conversation. The AI stack has this next-generation software stack feel, and today's generation is very much about Linux and applications. 
That next generation is about inferencing in production environments with PyTorch and vLLM, and maintaining that openness, the ability to see what's inside, the auditing, understanding how it should work, and the licensing. I think these are really important aspects of sovereignty. How do those conversations resonate with users, with a bank that's looking at a regulatory question about resiliency, for example?
09:55 - Jered Floyd
I see stories all the time about businesses basically being held ransom for licensing renewals because a software license, traditional software licenses, you can use this for the next year. And then, if you're really unlucky, it just automatically shuts off and then you can't do anything at all, or you're just out of compliance with the license and possibly subject to huge fines. With open source licensing, you have the right to use the software. With subscriptions, you might not get access to updates, but you still have the right to continue using the software and take over responsibility yourself, or find another provider that has more appropriate terms for your business. I think, from an AI perspective, that's also really interesting talking about data, because data is the core of every AI model. You don't use AI except for the purposes of processing your data, inferencing your data, trying to figure out trends in your data, and respond to requests. And so, all of the data sovereignty topics that we talked about are directly applicable to AI. And there, you're thinking about, again, where's my AI running, right? Is my AI co-located with my data? Am I sending my super critical data to a data center that happened to be able to get all of the accelerator chips that I needed or has the models that I want to use that aren't available to run on my own, in my own data center? Those are the things that you have to start thinking about when you're building your architecture. And for most of the critical business processes where you're applying generative AI, you don't need a chatbot. Instead, you start looking at smaller models. And some of these smaller models, smaller open-weight models, for example, you can run in your own data center. And some of these smaller models, you can even do the training yourself. So, there's a whole spectrum here of completely black-box LLM, which, a frontier model, which is super cool, but I don't know anything about how it was trained. 
I don't know if there are backdoors in its training. I don't know what data went into it, what biases went into it. Then, you have the smaller models that are open-weight models that often publish all the content that went into it, an inventory of the content that went into it, and those, I can run myself. And so, that gives me a little more sovereignty around the operations and the data, what's going in, what's going out. I know all those flows. And then, you start looking at the technology sovereignty side of this, and you look at being able to train small models yourself that you know absolutely everything. You know that there are no backdoors where it's going to send data off or insert a particular bias in a particular way. So, there's a whole spectrum of what fits your business need and what level of sovereignty, and then, what level of innovation you can apply to that that you might not have thought about before when you were just like, "Oh, it's all tokens. I just use the model that I think looks coolest and does the neatest things, and I'm just spending tokens." But now you get to think about, "Well, how's that gonna affect me next year? How's that going to affect me in five years? And where is that data going and where is that know-how going?"
About the show
Technically Speaking
What’s next for enterprise IT? No one has all the answers, but CTO Chris Wright knows the tech experts and industry leaders who are working on them.