Understanding AI Security Frameworks

Understanding AI Security Frameworks

2026 年 2 月 19 日 | Compiler Team 安全防护

Compiler • • Understanding AI Security Frameworks | Compiler

Understanding AI Security Frameworks | Compiler

About the episode

With AI, traditional security methods don’t apply. Conventional defenses and ways of thinking cannot account for the myriad of attack vectors an AI model can present to a nefarious actor.

Red Hat Principal Product Security Engineer Huzaifa Sidhpurwala breaks down the emerging security frameworks designed for the AI era, and tells us why complacency is (still) the weakest point when securing systems.

Compiler team Red Hat original show

脚本

We quickly pivoted to open source models, open source technology driving these things. Why? So that we could democratize its use.

Mhm.

Right. So it didn't turn into something that only the haves could use and the have nots couldn't.

Yeah.

And I think that's really important. Right. Because with something this... I mean this is a lightning rod. This is phenomenal technology that has huge potential. And to carve off a significant portion of... I'll just say humanity, because they're priced out of it, feels like a disservice to the planet.

This is Compiler, an original podcast from Red Hat. I'm your host, Emily Bock, a senior product manager at Red Hat.

And I'm Vincent Danen, Red Hat's vice president of product security.

On this show, we go beyond the buzzwords and jargon and simplify tech topics.

And in this episode, we're going to take a look at AI security frameworks.

To help us break down this complex topic. We invited a familiar voice back to the show. It's Huzaifa Sidhpurwala, a senior principal product security engineer with Red Hat's product security team. You might remember him from our episode around AI hallucinations. We asked Huzaifa for his take on the primary challenges in AI security today.

I think the biggest challenge is the fact that the industry is really new. And in any technology industry, like, you know, when new things come up. I think, most of the resources, most of the money, most of the manpower, and, you know, most of the things out there are spent on innovation and not really thinking about the security of like, you know, I'm going to innovate this product, I'm going to create this product. 99% of the time, security is an afterthought, which comes into picture only when the product is actually made and the product is out in the market.

People want to move fast, and there's a lot of pressure to do it.

I mean, that's true and Huzaifa is not wrong.

No. Absolutely not. I think that's a problem everyone faces at some point, is where in the narrative does security come into the equation. And usually it ends up being later than it should be.

Yeah. It's Band-Aids.

Band-Aids. Exactly. Treating a symptom, not curing the problem for sure. And I think that's not the only challenge with security in AI.

You know, people who are actually using these products, they really don't know what security for AI is or, you know, what it should be. When you ask them about security of the software they say is it free from security flaws, have you done any scanning tools on this? What happens when an issue is found? What is your SLA in fixing all of those issues? And, you know, all of the usual questions. A lot of them don't really translate to AI systems. And, you know, AI applications. So they are asking about model signing, model encryption, model hallucinations, something which we spoke about last time and all of those things. A lot of times those things may not be applicable to them at all. Like, so they are not asking... they probably do not know what right questions you need to ask.

That's a little bit of a square peg/round hole sort of situation when it comes to security for AI. It's hard to know what things we've done before will work for it and what things won't.

Well, yeah, I mean, we've talked about that before as well, right? Like we look at things in those old ways of the things that we know. We're not thinking critically or creatively about the things that we don't know. Like, we're not asking ourselves, what don't I know? What are the questions that I should be asking?

Yeah. It's like we've done this laundry list of things before now. Which things (holding up little paint swatches), which things will match with AI. But we could stand to be a little more creative about it too.

We could, but I think one of the interesting things is, you know, Huzaifa sees opportunity where a lot of other people see roadblocks.

We have reached a stage in which, you know, the technology is still coming up, and there's still a lot of scope to inject security into it. That's one thing. And the second thing is, the industry is opening up to open source way of doing AI.

Well, anything open source, you know, we're going to have some stuff to talk about.

100%. And I think he actually makes a really good point there about, the industry opening up to it. I mean, if you think about the genesis of open source and the proprietary software that was, you know, it was the big thing, right? Open source came along, the beginning very infantile, I guess. Might be a good word. Not, not awesome, but it was free, and the people were motivated to work on it. And you fast forward, you know, a couple decades. And what's everybody using? They're using open source? Right. So when you think about like these AI models, it initially came out, at the beginning. They're all proprietary black boxes. Nobody knew what was going into them, how they operated, etc.. Very new, very novel. But we quickly pivoted to open source models, open source technology, driving these things. Why? So that we could democratize its use.

Mhm.

Right. So it didn't turn into something that only the haves could use and the have nots couldn't.

Yeah.

And I think that's really important. Right. Because something this I mean this is a lightning rod. This, this is phenomenal technology that has huge potential. And to carve off a significant portion of... I'll just say humanity because they're priced out of it feels like a disservice to the planet. Right?

Exactly. And I think it's a disservice to the technology too, like, I think we've proven time and time again that the more brains on something, the faster typically it goes, and the more things that it will think of in the process. So democratizing AI in an open source kind of sense, I think just kind of benefits the whole thing too.

Completely. And if you look at like the speed of innovation with AI right now. Right. Part of that speed is billions of dollars. I mean, like, there's no question, right? Money is driving this thing forward. But at the same time, the fact that open source is there and involved in participating is causing that same acceleration that we saw with operating systems and web technologies in the cloud. All of those things we saw is happening with AI again, and that means we're going to get better and faster quality models and results that we wouldn't maybe otherwise get.

Exactly. Like, I think really the only thing that can stand toe to toe with finances is passion. And we have that in spades.

Yeah. And if we take it back to actually to the security piece as well, like in a black box model, like sure, some, some security might be in there, but you don't know what it is and you don't know how trustworthy it is. So we talked about trust before. Right. When you have it out in the open, anyone can look at it. Anyone can use it. Anyone knows how it's built. The trust is there in just the technology around it because now I can see it.

Yeah, I think that speeds adoption too. Like, if you can see how something works, you're going to be a lot more willing to use it.

Oh, and it is low cost attempts to try it, right? I don't have to worry about spending thousands of dollars on tokens for proprietary models. When I could stand up something locally, fiddle around with it, prove it out, and then go, okay, maybe now I want a better model and I can pay for it and I want to pay for it. But that testing process doesn't require it.

A try before you buy kind of thought process. I think that makes a lot of sense. And you know I think a lot of that sounds really promising. But what are some concrete examples of those emerging security practices.

Well, I think you're looking at I mean, some of the things that Huzaifa mentioned before, right. Like things like model signing, for things like model cards and just telling people what the security model or, what the security model of the model is. Right? Like, you're basically telling people upfront what some of the security capabilities are. And you're, you're looking at things like developing better guardrails. Right. Better software too as you're fine tuning models, how can you bake some security in there to make sure you're tuning it the way that you intend to tune it, and letting your end users know kind of what that criteria was. Right. So on the surface of it, a lot of it is that technology itself, like, how do we make these things in a way that's more secure, but then also how do we expose that information to the end users so that they know given these dozen... we have hundreds of models now, but which I've picked my top 12. Which one of them am I going to pick? The security is a concern for me. I should be able to have access to some sort of data to help me make a reasonable decision.

Right? It becomes a differentiator in its own right. Kind of like when you're buying a car, you're going to look at the safety specs as part of your decision making process, too. It's not all that different with AI and picking it based on that kind of security stance.

We've been hearing from Huzaifa Sidhpurwala about AI product security. Vincent, it sounds like there are existing security practices around AI and ones that are being developed.

Yeah. That's right. I think I'll let Huzaifa speak to some of them here. First there's model signing.

So model signing is very important. It basically means, is that when models are shipped by the vendors and when models are used by model integrators and customers, we need to make sure that the models are not tampered between them. So what signing basically does is vendors will sign the model and applications or customers who use the model will verify the signature. And if the signature matches, it means that the models have not been tampered with right.

And then there's also model cards. The show has covered this before. It's basically a nutrition label for an LLM.

If you see for example, model card of ANTM models, they are they are PDFs, which we say that all of you worked on for so many years and, you know, 100 engineers worked on this. And, you know, these are the benchmarks which we have done. And everybody tries to invent their own benchmarks. And the reason why they are doing this test, if they use the standard benchmark, probably the model doesn't fail good enough in their benchmark. So they will use something in which they are good at. The model card, which we are trying to work with in the community are cards which are machine readable and machine generate-able, which basically means that like, you know, my system can read the benchmark card and it can really figure out, like, you know, what safety issues this model faced then, you know, whether they have been fixed or they have not been fixed. What was the data which was used to train the model data provenance. Right. From where did this data come from? Are you using some random data from the internet somewhere? And, you know, are you training your model with that data?

Well, that does sound pretty familiar. And it sounds a lot like that kind of standardization needed for trust and transparency, which isn't a surprise. What about testing? How do we assess the safety and security of these AI models, and how does open source fit into all of this?

From a testing point of view, there are a lot of softwares which are available. There is one open source software called Garak which we, which we use internally. What it basically does is it sends prompts to the language model. And it looks at the output from the prompt. So like, you know, it sends a prompt saying, do you know how to make a bomb? And, it looks at the output. And if the output say, sorry, I cannot answer this question or it says, I cannot answer this question right now. Right. There is a difference between both of these answers. In the first case, the language model basically says, I will not do it. In the second case, it says I will not do it now, which basically means that if you try to convince the model a little bit more, then, you know, it may be able to reveal this kind of information. So Garak is basically, programed to kind of figure out what the answer is. And if it fails, then the answer is going to open a road to a particular, unsafe answer at the end. And, you know, it will try to convince the LLM to try to see if it is able to answer the... if it's able to answer an unsafe question. There are a lot of open source, openly hosted safety benchmarks also. So, for example, Hugging Face has got a safety benchmark where you can upload your model and Hugging Face will run those benchmarks internally. You run a couple of tests and the software is programed in a way that, you know, in this particular test, you should have at least 80% to be called safe. And in this particular thing, you should have at least 85% stuff like that. So it's a, it's a whole suite of tests which are normally run and the results are publicly available.

I think that goes back a little bit to what we were talking about around open source and having something publicly available helps build trust. And how you use it, if you can see how it works.

100%. I mean, there's that place where these open standards, like, I was talking about there with respect to, you know, I have information on my model, I tested, you know, in certain ways. It kind of feels like a, you know, trust me, bro, like, you know, like it feels...

Do trust me.

...it feels like that. It's like. Well, I don't know. What's the basis for my trust in you? I don't think I am going to trust you, bro. So I think that at that point, having a standard that all of these models have to adhere to and having a standard set of tests, whether it's security testing or as he was talking about safety testing, all of them go through the exact same thing, and then you have the exact same benchmark and you see how they fare against it. And I mean, at that point, it's the model makers responsibility to go, I'm unsatisfied with how we show up in this test. So there are things that I have to fix; similar as any other sort of test. Right. But this way it's standardized across the ecosystem. So anyone who's looking at it can go, I can actually compare all of these things to each other and understand what I'm getting myself into.

Yeah. And across all the same metrics as well. Like otherwise I was actually really enjoying what Huzaifa was saying around, you just choose your own benchmarks and then you're going to be the best because you're going to pick whichever benchmark your best at and essentially advertise that aspect of your LLM. And that would make it so much more difficult to pick the ones to trust more so because it's apples to oranges versus a standardized set of benchmarks that everyones measuring against.

Totally. And this is the... I mean, not to kind of go back to our roots, but this is the fundamental difference between open source and proprietary.

Mhm.

Right. I mean, with open source, everything is there, you can run it against any, you know, test suite or whatever. With proprietary software, you can't. Right. You can only look at the executable pieces. You can't look at the source code. With open source I can look at executables that I generate. And I can also look at the source code. So there's a, a greater level of scrutiny and testing available for open source. And I think it's the exact same way with these open models and open standards for safety and security benchmarking for these models. Right.

Yeah. It muddies the water on some of that, you know, picking the best metric for you too, because of all these open source ones are measuring against the same standard, it's a lot less acceptable, I guess, to then go, I don't care about those standards. I'm using my own.

Yeah, and I mean that road leads to failure all the time, right?

Yeah, absolutely. That seems like a really comprehensive approach to model testing. But AI systems are not just about the model itself either, right? Yeah.

No, I mean, we, you're right. It doesn't begin or end with the model. Right? There's integrations, there's movements towards you know, agentic AI. All that stuff means that the threat surface is bigger than just the LLM itself.

The model itself is not enough. You need to have AI agents that you know, things which will run processes in the back end. Agents which will, say, get you data from a Google Drive agent, which will allow you to run calculations agents, which will allow you to send email to people, analyze data, write code like, you know, this is basically what AI agents do. It's also important to pay important enough attention to how these agents basically work. And, if you read the internet, there's a lot of discussion about, like, you know, how unsafe these agents are. And you know, what protocols are used to talk between agents and you know, how unsafe they are and all of those things. So, because if you are using agentic workflow, if you are using agents, you have a very good model, you have a very safe model. But that's not enough. You need to make sure that your agents which interact with the model, the agents themselves and the interaction protocol, all of these things, the entire system is secure.

And of course, there is the most critical point in the process, which we've talked about before. People.

The humans are the weakest link in this chain, right? I mean, we have systems which are very secure. And, you know, we have all the infrastructure in place and we have perimeter security, and we have good firewalls and we have good policies and all of those things. We have a good framework which we think we will use. But the people who are really implementing this or, you know, the people who are, looking at the logs, the people who are getting those alerts when something bad happens. We are being complacent, I think that's the word.

So it really comes down to implementing common sense security practices consistently, regardless of how advanced the technology gets. So it sounds like there's a lot of stuff we're already doing in the context of AI and security around and for it. And I think there's, there's... we talked about it at the beginning and then just here at the end, too, I think there's two kind of main outstanding questions that will be the subject of much debate, I'm sure. Part one: what are we not thinking about? Like we've implemented a lot of like the more traditional type product security policies when it comes to AI, but where do those not fit and what can we do about it? And then also what do we do about the people, which is the eternal question when it comes to security?

Well, we haven't even solved that for traditional software use yet. So that one is an open question. But I have some thoughts there. When it comes to, you know, what haven't we thought of? Right. I mean, I don't know, because if I did, I would have thought about it. You know, I mean... all jokes aside, I think you have to start with those fundamental pieces, right? The traditional software security things that are directly applicable to AI. And I'll separate AI systems from AI models, like, we have to we have to put those basics in place. Now AI systems are software that host those models. So traditional application security, software security, product security all apply to those. So we can't discount those. There's nothing weird or novel or new there. It's just... do it. Do what we did before and do it just as well. For the AI, for the AI security piece, this one I've always found interesting because we don't even talk about it in a consistent way. When you're looking at AI security, people will talk about things like the model told me how to make a bomb because I tricked it with grandma's nighttime story. Napalm I guess that's the other one, right? That's not a security thing. It's a safety thing.

Yeah.

Right. When you're looking at security, we're talking about things in the typical CIA triad. So confidentiality, integrity and availability. Me being able to con or social engineer a model into giving me some information doesn't violate or impact any one of those confidentiality, integrity or availability. But it does bear in mind that question of safety, because I think we would all agree that models answering those sorts of questions like, do not train your model on The Anarchist Cookbook, please. Like those sorts of things we would consider to be unsafe. We talked about AI usage as a therapist. You know, there are certain things that a therapist would never recommend. Does the model recommend those things in, can you coerce it to do that in some way. But that's a safety issue, not a security issue. And I think that we have to sit there and really separate the two. Like moving forward, I don't think we're very good at this today, separate the two, have the security engineers like the security people focus on the security part. So like how do we make sure that confidential data stays confidential. How do we make sure that we can't tamper with or taint a model by our prompting. Right. And then make sure that we can't crash the model because maybe it's running some mission critical stuff. And, you know, especially important, whereas these agentic workflows, if I'm going to a model to get instructions for a task that I have to do and I'm, I'm knocking on the door and nobody's home, like, that's a security issue because you've impacted availability.

That... there's a lot to unpack there. And I have we could do a whole episode just on that.

I think safety is a big topic. Yeah.

It's one it's philosophically adjacent to security, which is why I think it enters the conversation so often. There's that aspect of secure safety and also liability, which is kind of the other side of that coin. And it's a very philosophical conversation to have. And it's interesting to see AI driving that forward like it's come up before, for sure, but not quite in the same way.

Yeah. And I mean, and there's more I like, I kind of boiled it down to what I, what I consider quite simple. Right. Because you can throw other things in there, like there's a trust piece as well. Right. But then there's also a bias piece and not all, not all safety things. So we think about say I mean we're both in North America, so the things that we think are unsafe in North America might very well be safe in another country. Or vice versa. So is that safety or is that bias?

Right.

So there's a bunch of things there. And I mean I don't mean to derail us too much, but I think they're, they're things worth considering because as we're looking at what do we call security issues in these models, we have to kind of set some of these things aside and go there slightly different, because when people are thinking about, I don't want to use AI because I feel like it's insecure... it just might be like it's not... it doesn't make me comfortable in some ways, but that doesn't mean it's insecure.

Yeah. Exactly. There's really a distinction between those two. And I think that's taking us a little bit away from the world of product security, because you can't necessarily use those classic tools we have, or even some of the same frameworks that we'll be applying a new and creative ways to AI to address safety concerns. And that's just....

No...

Make or gain.

Well, you can't even use traditional security tools like in the same way for, for AI, right. Like parts of these models have code in them. So you can use traditional, you know, SASS T-scanning or whatever and that code portion of the model. But you can't necessarily do that on the data part of the model.

Mhm.

And the thing that really concerns me, and I think that we have to think about this in the future is with those agentic workflows where there is no human in the loop, like right now, we usually have a place where, you know, I go, ask a bot a question about how to do some system administrative task, and then I have to copy/paste it or but there's an interrupt. I am the interrupt. I have to think about it. Right. Well these are agentic workflows, it's like this thing that the model told this agent to do, the agent is going to go do. There is no human in the loop right. And that's where the security part starts to come really into play. Because if I can have a model that I can coerce or coerce to, to generate a malicious command for somebody else. I mean, if I do something naughty to myself, I mean, I don't need a model to do that. I can go wreak my computer all I want. Right? But if I can coerce a model in some way to answer, let's just say, for example, the next person's question....

Yeah.

...or somebody at the same time, right, where I can actually influence the output given to somebody else irrespective of what they're doing, or I can hide something in the prompt. It's going there to get a certain type of response back that is malicious. It causes damage to their system or opens it up for attack by me. Right? Those are the things that we have to look at.

And now you've put a spanner in the works where that... how fast and efficient AI is at doing things becomes really a double-edged sword, because now it's doing bad things very quickly in your systems.

Totally. And I mean, like, I'm not the AI expert, so I don't know exactly how these things work, but the thing that comes to mind right now is like race conditions, right? Like these are known in software, right? You... and there and there are mechanisms to avoid them. What does a race condition look like when you have hundreds of thousands of people talking to AI bots at the exact same time.

Yeah.

Are there ways that these things can bleed over in some way. Like I don't know.

Yeah. And I my hot take now is you know, people are the problem in a lot of these situations when it comes to security. But it's also where some of the solutions can come from. And the more we educate people, the more comfortable with AI people get, the more ideas we'll have to help try to curtail some of these issues that pop up or stop them from happening, or recognize them when they do.

Yeah. And, you know, sit back critically, think, take the time to critically think and ask questions like not ask questions of AI. Although maybe I can help but like ask questions like go back to the you know, I mean, I've been doing this for a long time. So I think of it as the old school hacker mindset, how can I break this? You know, what happens if and then kind of just following that train of thought and that's where we're going to find novel ways of.... And I mean, don't get me wrong, we have some very novel ways of trying to address some of these problems today. They will just get cooler and more novel in the future. Right? But they exist because people are asking questions.

That's exactly right. And we got to keep doing that. And I think that's actually a really good opportunity. Now, we've given a lot of our own thoughts on all of these topics, but we also want you all to ask questions. We want to know what you all think about AI security. So hit us up on social media at Red Hat and use the #compilerpodcast. If you have some of your own hot takes about AI security frameworks, and I think that'll be it. What do you think?

I think that, that'll do it for this episode of Compiler.

This episode was written by Kim Huang. And a big thanks to our guest who's Huzaifa Sidhpurwala.

The Compiler is produced by the team at Red Hat with technical support from Dialect.

And if you liked today's episode, tell us. Oh and follow and review our show on your platform of choice.

Thanks a lot and see you later.

About the show

Compiler

Do you want to stay on top of tech, but find you’re short on time? Compiler presents perspectives, topics, and insights from the industry—free from jargon and judgment. We want to discover where technology is headed beyond the headlines, and create a place for new IT professionals to learn, grow, and thrive. If you are enjoying the show, let us know, and use #CompilerPodcast to share our episodes.

Understanding AI Security Frameworks

Understanding AI Security Frameworks | Compiler

About the episode

订阅

脚本

More about 安全防护

About the show

Compiler

平台

工具

试用购买与出售

联系我们

关于红帽

Change page language

Red Hat legal and privacy links

Red Hat legal and privacy links

Understanding AI Security Frameworks

Understanding AI Security Frameworks | Compiler

About the episode

订阅

脚本

More about 安全防护

红帽对 OpenPrinting CUPS 安全漏洞 CVE-2024-47076, CVE-2024-47175, CVE-2024-47176 和 CVE-2024-47177 的响应

2024 年 Kubernetes 安全状况

About the show

Compiler

平台

工具

试用购买与出售

联系我们

关于红帽

Change page language

Red Hat legal and privacy links

Red Hat legal and privacy links