Protecting intellectual property and proprietary artificial intelligence (AI) models has become increasingly important in today's business landscape. Unauthorized access can have disastrous consequences with respect to competitiveness, compliance and other vital factors, making it essential to implement leading security measures.
Confidential computing is one of these technologies, using hardware-based trusted execution environments (TEEs) to create enclaves with strengthened security postures. These enclaves help protect sensitive data and computations from unauthorized access, even by privileged software or administrators. Data is encrypted while in use. Confidential computing adds an additional layer of protection to your multi-layer defense-in-depth strategy, better safeguarding your sensitive data and intellectual property.
The CNCF Confidential Containers (CoCo) project provides a platform for building cloud-native solutions that leverage confidential computing technologies. If you need to protect your Kubernetes workload by running it inside a trusted execution environment, CoCo is a strong choice.
Enkrypt AI is building solutions to address growing needs around AI compliance, privacy, security and metering. As businesses increasingly rely on AI-driven insights, confirming the integrity, authenticity and privacy of the AI models and the data becomes paramount and is not fully addressed by current solutions in the market. This blog will discuss how Enkrypt AI uses the CNCF CoCo project as the foundation towards enhancing security of model deployment, which is one of Enkrypt AI’s key focus areas.
The Confidential Containers (CoCo) Project
The goal of the CoCo project is to standardize confidential computing at the pod level and simplify its consumption in Kubernetes. This enables Kubernetes users to deploy confidential container workloads using familiar workflows and tools without extensive knowledge of underlying confidential computing technologies.
With CoCo, you can deploy your workload on infrastructure owned by someone else, which significantly reduces the risk of unauthorized entities accessing your workload data and extracting your secrets. For this blog, we focus on the Azure cloud infrastructure. On the technical side, confidentiality capabilities are achieved by encrypting the computer’s memory and protecting other low-level resources your workload requires at the hardware level.
For more information on the CoCo threat model, the Kata containers project (which CoCo uses extensively), CoCo architecture and main building blocks, we recommend reading Deploying confidential containers on the public cloud.
The following diagram presents the up-to-date architecture of the CoCo project (November 2023):
For the current high-level blog, we will not cover all components. However, we need to mention a few so the reader can understand later sections.
What are peer-pods?
The peer-pods approach creates Kata virtual machines (VMs) in an environment by using that environment's native infrastructure management application programming interfaces (APIs). For example, Microsoft Azure APIs are used to create Kata VMs on Azure. This contrasts with the alternative approach of creating Kata VMs using local hypervisor APIs (e.g. QEMU/KVM). The cloud-api-adaptor sub-project of the CNCF confidential containers project implements the peer-pods approach.
The following diagram shows the major components involved in the peer-pods solution:
A summary of the main components:
- Remote hypervisor support: Enhancing the Kata runtime to interact with the infrastructure provider APIs (using cloud-api-adaptor) instead of directly calling local hypervisor APIs.
- Cloud-api-adaptor: Implements the Kata VM lifecycle management on third-party infrastructure by utilizing respective infrastructure APIs. For example, Azure cloud provider support in cloud-api-adaptor is used to manage the Kata VM lifecycle in Azure cloud.
- Agent-protocol-forwarder: Enables communication between the Kata runtime running on the worker node and the remote Kata VM over TCP.
Why are peer-pods important for this discussion?
Peer-pods are crucial in our discussion because they enable us to create Azure confidential VMs (CVMs) for running Kubernetes pods in trusted execution environments provided by AMD SEV-SNP or Intel TDX.
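From the Kubernetes user's side, opting a pod into peer-pods typically comes down to selecting a runtime class. The following is a minimal sketch, not taken from the Enkrypt AI deployment: it assumes the cluster exposes a runtime class named kata-remote for peer-pods (the name may differ in your installation), and the pod name and image are hypothetical placeholders.

```python
import json


def confidential_pod_manifest(name: str, image: str,
                              runtime_class: str = "kata-remote") -> dict:
    """Build a minimal Pod manifest that requests a peer-pods runtime class.

    `runtime_class` is an assumption: use whatever RuntimeClass your
    cluster's peer-pods installation actually registers.
    """
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            # Selecting the runtime class is what routes the pod to a
            # remote (peer-pod) VM instead of a regular container runtime.
            "runtimeClassName": runtime_class,
            "containers": [{"name": name, "image": image}],
        },
    }


if __name__ == "__main__":
    # Hypothetical name and image, for illustration only.
    manifest = confidential_pod_manifest(
        "key-manager", "registry.example.com/key-manager:latest"
    )
    print(json.dumps(manifest, indent=2))
```

The rest of the pod spec stays the same as for any other workload, which is the point of the peer-pods design: the confidential VM is an infrastructure detail rather than something the application manifest has to describe.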
Attestation and key management components
In CoCo, attestation involves using cryptography-based proofs to protect your workload from tampering. This process helps validate that your software is running without any unauthorized software, memory modification, or malicious CPU state that can compromise your initialized state. In short, CoCo helps confirm that your software runs without tampering in a trusted environment.
The Attestation Agent (AA) is a process on the VM (peer-pod VM in our case), providing a local facility for secure key release to the pod. It is part of guest-components. It gathers the TEE evidence to prove the confidentiality of its environment. The evidence is then passed to the Key Broker Service (described below), along with the request for a specific key.
The Key Broker Service (KBS) is a discrete, remotely deployed service acting as a Relying Party. It manages access to a set of secret keys and will release those keys depending on the authenticity of the Evidence provided by the AA and conformance with predefined policies. KBS is a remote attestation entry point that integrates the Attestation Service (described below) to verify the TEE evidence.
The Attestation Service (AS) is a component responsible for verifying that the Evidence provided to the KBS by an Attester is genuine. AS verifies the evidence signature and the Trusted Computing Base (TCB) described by that evidence. Upon successful verification, it will extract facts about the TEE from the given Evidence and provide it back as a uniform claim to the KBS. It can be deployed as a discrete service or integrated as a module into a KBS deployment.
Given that we have an application running inside a confidential pod (backed by a confidential VM) requiring a secret key, the following diagram describes the CoCo attestation workflow:
Following are the steps (marked by numbers):
- Within a peer-pod confidential VM, the confidential pod asks the AA in guest-components for a secret key
- The AA requests the secret key from the KBS
- The KBS answers with a cryptographic nonce which is required to be embedded in the Evidence so this particular exchange cannot be replayed
- The AA sends the evidence to the KBS
- The KBS will verify this evidence with the AS
- If the verification is successful the KBS releases the key to the AA
- The AA then provides the key back to the confidential pod so it can continue its flow
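The steps above can be sketched as a small in-process simulation. This is an illustrative model only, not the CoCo implementation or its API: the class and method names are hypothetical, and a shared HMAC key stands in for the hardware-rooted signature that a real TEE produces with an asymmetric key chained to the CPU vendor.

```python
import hashlib
import hmac
import secrets

# ASSUMPTION: stand-in for the hardware root of trust. Real evidence is
# signed with an asymmetric key; HMAC keeps this sketch dependency-free.
HARDWARE_KEY = secrets.token_bytes(32)
# ASSUMPTION: the reference measurement the verifier expects to see.
EXPECTED_MEASUREMENT = hashlib.sha256(b"trusted-guest-image").hexdigest()


def sign_evidence(measurement: str, nonce: str) -> str:
    """Bind the measurement to the nonce, as the TEE hardware would."""
    message = f"{measurement}|{nonce}".encode()
    return hmac.new(HARDWARE_KEY, message, hashlib.sha256).hexdigest()


class AttestationService:
    """Checks the evidence signature and the reported measurement."""

    def verify(self, evidence: dict) -> bool:
        expected = sign_evidence(evidence["measurement"], evidence["nonce"])
        signature_ok = hmac.compare_digest(expected, evidence["signature"])
        return signature_ok and evidence["measurement"] == EXPECTED_MEASUREMENT


class KeyBrokerService:
    """Issues nonces and releases keys only after successful verification."""

    def __init__(self, attestation_service: AttestationService, keys: dict):
        self.attestation_service = attestation_service
        self.keys = keys
        self.pending_nonces = set()

    def request_challenge(self) -> str:
        nonce = secrets.token_hex(16)
        self.pending_nonces.add(nonce)
        return nonce

    def release_key(self, key_id: str, evidence: dict):
        nonce = evidence["nonce"]
        if nonce not in self.pending_nonces:
            return None  # unknown or already used nonce: replay rejected
        self.pending_nonces.discard(nonce)  # each nonce is single use
        if not self.attestation_service.verify(evidence):
            return None
        return self.keys.get(key_id)


class AttestationAgent:
    """Runs inside the guest and fetches keys on behalf of the pod."""

    def __init__(self, kbs: KeyBrokerService, measurement: str):
        self.kbs = kbs
        self.measurement = measurement
        self.last_evidence = None

    def get_key(self, key_id: str):
        nonce = self.kbs.request_challenge()
        evidence = {
            "measurement": self.measurement,
            "nonce": nonce,
            "signature": sign_evidence(self.measurement, nonce),
        }
        self.last_evidence = evidence
        return self.kbs.release_key(key_id, evidence)
```

In this sketch, an agent reporting the expected measurement receives the key, an agent reporting any other measurement receives nothing, and resubmitting previously accepted evidence fails because its nonce has already been consumed.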
For more details on the topic of attestation we recommend reading the blog series Learn about Confidential Computing Attestation.
For a deeper understanding of how attestation happens in practice on Azure cloud for the case of AMD SEV-SNP (which this work is based on), we recommend reading Confidential Containers on Azure with OpenShift: A technical deep dive.
AI workloads in the public cloud and confidentiality
The public cloud offers the following benefits for running AI workloads:
- Scalability: AI models, especially deep learning ones, require significant computational power. Public clouds can instantly provide the necessary resources without any upfront capital expenditure. You can also remove those resources once the work is done
- Data storage: AI requires vast amounts of data. Public clouds offer vast storage solutions that are both flexible and cost-effective
- Advanced tools and services: Cloud providers like AWS, Azure and Google Cloud offer specialized tools for AI and machine learning, making it easier for developers to build and deploy models. One such example is using advanced GPU capabilities
- Global reach: Public clouds have data centers across the globe, allowing AI services to be deployed closer to end-users, reducing latency
- Collaborative development environment: The cloud fosters a collaborative workspace. Teams can work simultaneously on AI projects, share resources and iterate rapidly. This collaborative approach accelerates development cycles and promotes knowledge sharing
However, with the benefits come a few challenges, especially regarding the confidentiality of the sensitive data used for training and protecting the trained model. Understanding the specific confidentiality requirements of different workloads is important. Let's delve into which AI workloads demand stringent confidentiality and why.
Which AI workloads require confidentiality and why?
Not all AI workloads require stringent confidentiality, but those dealing with sensitive data certainly do. Here's why:
- Medical diagnostics: AI models that predict diseases or suggest treatments handle sensitive patient data. Breaches can violate patient privacy and trust.
- Financial forecasting: Models predicting stock market trends or credit scores deal with confidential financial data. Unauthorized access can lead to financial losses or unfair advantages.
- Personal assistants: AI-driven personal assistants have access to personal emails, schedules and preferences. Ensuring confidentiality is crucial to protect user privacy.
- Autonomous vehicles: These vehicles collect real-time data about their surroundings and users. Ensuring data confidentiality is vital for user trust and safety.
In essence, while AI integration with the public cloud amplifies its capabilities, understanding the nuances of different workloads and their confidentiality requirements is crucial for ethical, secure and efficient operations.
In subsequent sections, we'll see how Enkrypt AI is leveraging the CNCF CoCo project to safely and securely deploy AI workloads.
Using Enkrypt AI with CoCo for AI workloads
What is Enkrypt AI?
Enkrypt AI is building solutions to address growing needs around AI compliance, privacy, security and metering. As businesses increasingly rely on AI-driven insights, ensuring the integrity, authenticity and privacy of the AI models and the data becomes paramount and is currently not fully addressed by solutions in the market.
Enkrypt AI's solution enables the confidentiality and integrity of the AI models, when deployed in third-party infrastructures, including VPCs and edge devices. Enkrypt AI leverages state-of-the-art cryptography techniques, such as homomorphic encryption coupled with the CNCF CoCo project for their solution.
Homomorphic encryption is a form of encryption that allows computations to be performed on encrypted data without first decrypting it. The output of the process is also encrypted; when decrypted, however, the result is the same as if all the work had been performed on unencrypted data. The name "homomorphic" comes from the algebraic notion of a homomorphism, which is a structure-preserving map between two structures of the same type. In our case, encryption and decryption are homomorphisms between the unencrypted and encrypted data.
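To make the homomorphic property concrete, here is a toy sketch using the Paillier cryptosystem, in which multiplying two ciphertexts yields an encryption of the sum of the plaintexts. Paillier is only additively (partially) homomorphic, so this is not FHE and not Enkrypt AI's scheme. The tiny primes are chosen for readability and provide no security, and the function names are ours.

```python
import math
import secrets


def keygen(p: int = 1009, q: int = 1013):
    """Return (public n, private (lam, mu)). Toy primes: NOT secure."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p-1, q-1)
    mu = pow(lam, -1, n)  # valid because the generator is g = n + 1
    return n, (lam, mu)


def encrypt(n: int, m: int) -> int:
    """Encrypt 0 <= m < n as c = (n+1)^m * r^n mod n^2 with random r."""
    n2 = n * n
    while True:
        r = secrets.randbelow(n - 1) + 1
        if math.gcd(r, n) == 1:
            break
    return (pow(n + 1, m, n2) * pow(r, n, n2)) % n2


def decrypt(n: int, private_key, c: int) -> int:
    """Recover m = L(c^lam mod n^2) * mu mod n, where L(x) = (x-1)//n."""
    lam, mu = private_key
    x = pow(c, lam, n * n)
    return ((x - 1) // n * mu) % n


def add_encrypted(n: int, c1: int, c2: int) -> int:
    """Homomorphic addition: multiply ciphertexts modulo n^2."""
    return (c1 * c2) % (n * n)


if __name__ == "__main__":
    n, private_key = keygen()
    c_sum = add_encrypted(n, encrypt(n, 20), encrypt(n, 22))
    print(decrypt(n, private_key, c_sum))  # 42
```

The party calling add_encrypted never sees 20, 22 or 42 and never needs the private key; only the key holder can read the result. Fully homomorphic schemes extend this idea to support both addition and multiplication on ciphertexts.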
This approach prevents potential attacks on processing decrypted data and is typically leveraged to process data in cloud environments where the data is always encrypted. To emphasize, even the cloud provider admins aren't able to decrypt or manipulate this data since they have no access to the keys.
Computation over encrypted data means, in practice, evaluating boolean circuits (operations on yes/no values) or arithmetic circuits. Arithmetic circuits are a standard computational model for computing expressions consisting of variables, coefficients and operations such as addition, subtraction and multiplication. These are the same building blocks used in practice when performing encryption and decryption (such as with private/public keys).
Going back to homomorphic encryption, in practice, there are a number of types based on what computations they can actually perform (partially, somewhat, leveled). For our discussion here we focus on Fully Homomorphic Encryption (FHE) which lets you perform the most computations over encrypted data.
FHE plays a pivotal role for AI workloads in ensuring that data remains encrypted even during computation. This unique property of FHE enables AI models to be authenticated without ever exposing the underlying data. Previously, FHE has been applied to data; Enkrypt AI now applies it to model weights. FHE, like most common cryptographic schemes, generates a public and a private key (the public key is used for encryption and the private key is used for decryption). Securing the private keys is critical for the Enkrypt AI solution.
Although FHE offers significant privacy and security benefits, it does come with its own overhead:
- Computational overhead for performing the operations we previously mentioned
- Storage overhead: When encrypting data with FHE it typically becomes larger than its plaintext counterpart due to encoding methods that obscure patterns and structures
Enkrypt AI overcomes the computational overhead challenges associated with FHE by selectively encrypting parts of the AI model. This approach drastically reduces the computational overhead and latency associated with full-model encryption, while still maintaining a high level of security and verifying that only the authorized and permitted users can make sense of the model outputs (essentially a usable model).
Enkrypt AI employs a risk-based approach to determine which parts of the model to encrypt. This means that only high-risk components, such as those containing sensitive information or critical to the model's performance, are prioritized for encryption. This selective encryption strategy not only reduces the computational and latency costs but also decreases the size of the encrypted model files, making them more manageable for storage and transmission. This overcomes the storage overhead challenges with FHE. A typical example of this would be to encrypt the final layers of the model (those critical for fine-tuning), ensuring that the output from a partially encrypted model always stays encrypted.
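The idea of selective encryption can be illustrated with a short sketch. This is not Enkrypt AI's implementation: the "last k layers" policy is a hypothetical stand-in for their risk-based selection, the model is represented as a plain dictionary of weight lists, and the encryption step is passed in as a callable, with a placeholder used below where an FHE library would go.

```python
from typing import Callable, Dict, List, Tuple

Weights = Dict[str, List[float]]


def select_high_risk_layers(layer_names: List[str], last_k: int) -> List[str]:
    """HYPOTHETICAL policy: treat the final `last_k` layers as high risk."""
    return layer_names[-last_k:] if last_k > 0 else []


def partially_encrypt(
    model: Weights,
    high_risk: List[str],
    encrypt_layer: Callable[[List[float]], object],
) -> Tuple[dict, float]:
    """Encrypt only the selected layers.

    Returns the protected model and the fraction of parameters encrypted.
    """
    protected = {}
    encrypted_params = 0
    total_params = 0
    for name, weights in model.items():
        total_params += len(weights)
        if name in high_risk:
            protected[name] = {"encrypted": True,
                               "payload": encrypt_layer(weights)}
            encrypted_params += len(weights)
        else:
            protected[name] = {"encrypted": False, "payload": weights}
    fraction = encrypted_params / total_params if total_params else 0.0
    return protected, fraction


def placeholder_encrypt(weights: List[float]) -> str:
    """PLACEHOLDER, not real encryption: marks where FHE would be applied."""
    return f"<ciphertext of {len(weights)} weights>"


if __name__ == "__main__":
    # Tiny made-up model: two large early layers, two small final layers.
    model = {
        "embed": [0.1, 0.2, 0.3, 0.4],
        "hidden": [0.5, 0.6, 0.7, 0.8],
        "head_1": [0.9],
        "head_2": [1.0],
    }
    risky = select_high_risk_layers(list(model), last_k=2)
    protected, fraction = partially_encrypt(model, risky, placeholder_encrypt)
    print(risky, f"{fraction:.0%} of parameters encrypted")
```

In this made-up model, encrypting the two final layers covers only a fifth of the parameters, which is where the savings in computation and ciphertext size come from.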
To summarize, Enkrypt AI's solution balances security, storage size and processing speed: it retains the protection FHE offers while mitigating the computational and storage overhead FHE introduces.
The Enkrypt AI solution can be divided into 3 parts:
- Model owner (model developer): converting a trained model to a secure model
- Model user (end user who wants the model deployed on their compute infrastructure): loading a secured model and interacting with it (pushing data and getting back results)
- Enkrypt AI key manager: managing the private keys for the decryption
The following diagram shows how these parts come together:
Note the following:
- The model is first encrypted with enhanced security and deployed by the model owner
- The model owner is the one pushing the private key into the Enkrypt AI key manager
- The model user is the one sending the requests with the encrypted output to be decrypted with that key
- Raw data is sent back to the model user after decryption
For additional details on the Enkrypt AI solution see White Paper - Enkrypt AI.
So how does CoCo come into the picture?
The Enkrypt AI key manager is a workload that is potentially vulnerable to key extraction by a malicious infrastructure admin. The previous section rests on one basic assumption: that the private keys can be safely stored and used inside the Enkrypt AI key manager. If we could assume that the Enkrypt AI key manager runs in a fully isolated and protected environment, the solution would be fine as it is. In practice, however, that isn’t the case, especially in third-party cloud deployments.
A second challenge is protecting the AI model and any sensitive data used for the AI workload. For instance, with a mental health chatbot, the data entered by users is highly sensitive and the model itself needs to be secured to prevent tampering.
The CoCo solution solves these problems as follows:
- Protecting the Key Manager: By running the Enkrypt AI key manager inside a confidential container we can make sure the cloud provider can’t access the private keys.
- Protecting the AI workload: By running the model user inside a confidential container we can also make sure the data and model are protected.
Protecting the Key Manager
As described in the previous sections, the critical element of Enkrypt AI's solution is the Enkrypt AI key manager. CoCo is used to secure the Enkrypt AI key manager code and to protect the keys it manages, even when they are in use.
The Enkrypt AI key manager is deployed as a confidential container inside a trusted execution environment to protect the code and the keys at runtime.
The resulting deployment is shown in the following diagram:
Protecting the user data and AI workload
There are scenarios when it is feasible to deploy the complete model inside a confidential container, such as for traditional machine learning (ML) models and non-GPU accelerated workloads. In such cases, Enkrypt AI uses CoCo to deploy the model within a trusted execution environment. In addition, Enkrypt AI’s in-house SDK client makes sure that the data used for inference is always encrypted and only decrypted at the end-user's side, providing end-to-end privacy and security for the entire inference workflow.
The resulting deployment is described in the following diagram:
Enkrypt AI's software development kit (SDK) streamlines AI model deployment by placing them in a confidential container, safeguarding them from unauthorized access. It handles permissions, necessary attestations and verifications, simplifying the deployment process.
The SDK also takes care of encryption, key management and decryption, making it user-friendly for sending inputs and receiving outputs more securely.
From a user's perspective, data security is paramount. Both input and inference output remain encrypted, with keys accessible only within the security-enhanced CoCo environment. The AI model's integrity is guaranteed and can be verified by authorized parties.
Demoing the use case
Demo video providing an overview of components constituting a confidential containers solution on Red Hat OpenShift.
Demo video showing secure key retrieval by a sample confidential container workload.
Demo video showing a sample confidential containers deployment with encrypted container image.
Demo video showing AI model protection using Enkrypt AI running inside an AMD SEV-SNP trusted execution environment with OpenShift confidential containers on Azure.
In this article, we introduced the CNCF confidential containers project, covered a few of the key CoCo building blocks (peer-pods, KBS, AS etc.) and then looked at how confidential containers provide the foundation to protect the AI workloads in the public cloud.
We then focused on how Enkrypt AI is solving their customer challenges around model management and protection by enabling secure key management and tamper-proof machine learning (ML) deployments using CoCo.
Related blog series
A blog series on Confidential Containers
About the authors
Tanay is working in the area of large language model security, privacy and governance. He is a key software engineer at Enkrypt AI, responsible for the work on productizing confidential containers for AI workloads.
Pradipta is working in the area of confidential containers to enhance the privacy and security of container workloads running in the public cloud. He is one of the project maintainers of the CNCF confidential containers project.
Suraj Deshmukh is working on the Confidential Containers open source project for Microsoft. He has been working with Kubernetes since version 1.2. He is currently focused on integrating Kubernetes and Confidential Containers on Azure.
Jens Freimann is a Software Engineering Manager at Red Hat with a focus on OpenShift sandboxed containers and Confidential Containers. He has been with Red Hat for more than six years, during which he has made contributions to low-level virtualization features in QEMU, KVM and virtio(-net). Freimann is passionate about Confidential Computing and has a keen interest in helping organizations implement the technology. Freimann has over 15 years of experience in the tech industry and has held various technical roles throughout his career.
Magnus has received an academic education in Humanities and Computer Science. He has been working in the software industry for around 15 years. Starting out in the world of proprietary Unix he quickly learned to appreciate open source and has applied it everywhere since. He was engaged mostly in the niches of automation, virtualization and cloud computing. During his career in various industries (mobility, sustainability, tech) he exercised various contributor and leadership roles. Currently he's employed as a Software Engineer at Microsoft, working in the Azure Core organization.