  • April 25, 2023
  • 8 min read

How We Built Cages: Wrapping Up

How We Built Cages has been our attempt to invite you in to see what goes into building a generic enclave runtime. Each post individually stepped through the complexity of running services in enclaves and how Cages abstracts it away for you.

While Cages intends to provide a general-purpose enclave environment, there are two tenets that take precedence over all else: Security and Usability. This post is a high-level discussion of how we’ve built with Security and Usability at the core, but if you want to learn more about any of the topics discussed, then check out the previous posts on How We Built Cages.

Security: It’s signatures all the way down

It goes without saying that when building a product around Trusted Execution Environments (TEEs), Security is a key element. But there are certain features within Cages that make the definition of Security more interesting.

Firstly, Cages aren’t built on Evervault infrastructure. When deploying to a Cage, a developer performs the build in their own environment (on their machine or in CI). The Cage build, which works wherever you can run Docker, produces a signed Enclave Image File (EIF) and the corresponding Platform Configuration Registers (PCRs) used to attest the remote enclave.

The EIF is uploaded to the Evervault Infrastructure to begin deploying it into a Cage. Deploying an EIF to a Cage is a pretty involved process; you can read more about it here. But glossing over all of the interesting details (seriously, you should read the post), we end up with your EIF running in a Nitro Enclave with an Evervault process running on the host EC2 instance. This lets us route traffic into the Enclave, monitor its health, and load in any runtime configurations (more on this later).

So, why is this model uniquely interesting?

In our infrastructure, the TEE is the untrusted component. We can’t say for certain what’s running in the enclave at any point. Maybe it’s a normal Cage, built using our CLI as expected. But maybe you’ve decided to try to escape the TEE and deployed a custom EIF with a Data Plane that we’ve never seen before. And that’s by design. Attestation means nothing if you don’t know where your image was built. If we built it for you, we could serve you back a set of hardcoded PCRs, or we could inject some code that you haven’t seen before. There’d be no way to know.

Trusting the Untrusted

Our trust model works in two ways: for you to trust your Cage, and for us to trust your Cage. Helping you trust the Cage is relatively straightforward. We embed attestation documents into the TLS handshake and validate them within our SDKs. You don’t need a special client. Just tell us the PCRs, and we’ll make sure you can only connect to the Cage when they match. (This gives some cool guarantees about your connection; you can read more about them in our docs.)
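To make the "only connect when they match" rule concrete, here is a minimal sketch of a client-side PCR comparison. It is not the SDK's actual code: the `Pcrs` struct, the choice of registers, and the `pcrs_match` function are hypothetical names chosen for illustration, and it assumes the attestation document has already been signature-checked and parsed.

```rust
/// PCR measurements as hex strings (hypothetical layout for this sketch).
/// In the expected set, `None` means "this register is not pinned".
#[derive(Debug, Clone, Default)]
pub struct Pcrs {
    pub pcr0: Option<String>,
    pub pcr1: Option<String>,
    pub pcr2: Option<String>,
    pub pcr8: Option<String>,
}

/// A register matches if the caller did not pin it, or if the attested
/// value is present and equal. Hex comparison ignores letter case.
fn register_matches(expected: &Option<String>, attested: &Option<String>) -> bool {
    match (expected, attested) {
        (None, _) => true,
        (Some(want), Some(got)) => want.eq_ignore_ascii_case(got),
        (Some(_), None) => false,
    }
}

/// Returns true only if every pinned register matches the attested value.
/// An expected set that pins nothing is rejected, so an empty policy
/// cannot silently accept any enclave.
pub fn pcrs_match(expected: &Pcrs, attested: &Pcrs) -> bool {
    let pins_something = expected.pcr0.is_some()
        || expected.pcr1.is_some()
        || expected.pcr2.is_some()
        || expected.pcr8.is_some();

    pins_something
        && register_matches(&expected.pcr0, &attested.pcr0)
        && register_matches(&expected.pcr1, &attested.pcr1)
        && register_matches(&expected.pcr2, &attested.pcr2)
        && register_matches(&expected.pcr8, &attested.pcr8)
}
```

A client would call `pcrs_match` during the handshake and abort the connection on `false`. Rejecting an empty expected set is a design choice of this sketch, made so that forgetting to supply PCRs fails closed.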

But we also need to trust the Cage within our infrastructure. We can’t provision a CA or secrets to the Cage without knowing for sure that this is what you intended to run. So how can we make sure that an unknown process is intended to receive the sensitive information it's requesting?

I mentioned earlier that there’s an Evervault process running on the host EC2 instance. This process is deployed from the Evervault infrastructure, so we can trust it (despite not being in a TEE, go figure). This process has access to an mTLS client certificate, which ties the requests it makes to internal services to the Cage running in the Enclave. We use our trust in this process to issue short-lived JWTs, bound to that instance’s IP, to the Cage. The Cage can then send an attestation document containing this JWT to the internal services to retrieve its CA and secrets.
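The two properties of that token, a short lifetime and a binding to the instance's IP, can be pictured as a pair of claim checks. The sketch below is an assumption about how such checks might look, not the real token format: `TokenClaims`, its field names, and `check_claims` are hypothetical, and the token's signature is assumed to have been verified separately.

```rust
use std::net::IpAddr;

/// Hypothetical claims carried by the short-lived token.
#[derive(Debug, Clone)]
pub struct TokenClaims {
    /// Expiry, in seconds since the Unix epoch.
    pub expires_at: u64,
    /// IP address of the instance the token was issued to.
    pub bound_ip: IpAddr,
}

#[derive(Debug, PartialEq, Eq)]
pub enum TokenError {
    Expired,
    WrongInstance,
}

/// Accepts the token only if it has not expired and the request
/// arrives from the IP address the token was bound to.
pub fn check_claims(
    claims: &TokenClaims,
    now: u64,
    peer_ip: IpAddr,
) -> Result<(), TokenError> {
    if now >= claims.expires_at {
        return Err(TokenError::Expired);
    }
    if claims.bound_ip != peer_ip {
        return Err(TokenError::WrongInstance);
    }
    Ok(())
}
```

Passing `now` in as a parameter, rather than reading the clock inside the function, keeps the check deterministic and easy to test.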

So our provisioning service can verify the signature on the attestation document. If that checks out, we can verify the JWT signature embedded in the document. Then we can confirm that the PCRs we recorded during deployment match the values in the attestation document. And once that’s done, we can issue a CA to the Cage to sign its own attestable TLS certs.
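The order of those checks matters, so here is a rough sketch of the sequence. Everything in it is illustrative: `AttestationDoc`, `ProvisionError`, `CaGrant`, and `provision` are hypothetical names, and the two signature checks are passed in as closures because real verification needs cryptographic libraries that are out of scope here.

```rust
/// Simplified stand-in for a parsed attestation document.
#[derive(Debug, Clone)]
pub struct AttestationDoc {
    /// PCR values reported by the enclave, as hex strings.
    pub pcrs: Vec<String>,
    /// The JWT the Cage embedded in the document.
    pub embedded_jwt: String,
}

#[derive(Debug, PartialEq, Eq)]
pub enum ProvisionError {
    BadAttestationSignature,
    BadJwtSignature,
    PcrMismatch,
}

/// Marker returned when every check passes. A real service would
/// issue the CA at this point.
#[derive(Debug, PartialEq, Eq)]
pub struct CaGrant;

pub fn provision(
    doc: &AttestationDoc,
    recorded_pcrs: &[String],
    attestation_sig_ok: impl Fn(&AttestationDoc) -> bool,
    jwt_sig_ok: impl Fn(&str) -> bool,
) -> Result<CaGrant, ProvisionError> {
    // 1. The document itself must carry a valid signature.
    if !attestation_sig_ok(doc) {
        return Err(ProvisionError::BadAttestationSignature);
    }
    // 2. Only then is the embedded JWT worth inspecting.
    if !jwt_sig_ok(&doc.embedded_jwt) {
        return Err(ProvisionError::BadJwtSignature);
    }
    // 3. The attested PCRs must equal those recorded at deployment.
    if doc.pcrs.as_slice() != recorded_pcrs {
        return Err(ProvisionError::PcrMismatch);
    }
    Ok(CaGrant)
}
```

Each step only runs if the previous one succeeded, so nothing inside the document is trusted before the document's own signature has been checked.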

We also use this handshake to provide ephemeral JWTs to your Cage to allow you to encrypt and decrypt your data. Cages integrates with the Evervault encryption platform to make it easy to keep your sensitive data in plaintext only within an Enclave. Support for encryption and decryption in Cages doesn’t require any dependencies. We expose a small REST API within the Enclave to proxy traffic to our internal encryption engine. Additionally, any Evervault-encrypted data sent into your Cage will be automatically decrypted, so your webserver can expect to receive the plaintext without any code changes.

Usability: If a tree falls in an attestable forest…

I think TEEs are cool (maybe that’s obvious at this point). But I also think having no ability to access the outside world can make it hard to use TEEs for meaningful applications, sort of like pure functional programming.

Cages tries to offer the security model of TEEs but with some additions to make them more usable. One of these is the convenience of building in Docker, and another is simplified deployments. You can go from a Dockerfile to an enclave in about 15 minutes (just enough time to make a really elaborate coffee).

But beyond making it easy to get your code into an enclave, we’ve put a lot of effort into making it straightforward to run your code in an enclave. If you need to run a small web server that takes in a customer’s information, calls an API to provision a card number for them, and returns it to you, it should just work. Or maybe you need to call a Cage and have it download an ID, send it to a KYC API, and return the result. Or you want to take in an encrypted private key, and sign a transaction before sending it to a gateway.

All of these services should work out of the box in a Cage despite relying on calling third parties.

Cages runs several proxies in the Enclave, which let your service perform DNS queries and open TCP connections to remote servers as normal. Without getting too in the weeds (because we already have a post for that), our DNS proxy lies to your service and resolves every hostname to a loopback address. We then have a TCP proxy on loopback which forwards your traffic to the host process, which in turn sends it on to the remote server. At this point, I should probably mention that we support egress allow-listing, which is enforced both in the enclave and on the host.
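As a rough illustration of those two ideas, the sketch below models a resolver that always answers with loopback and an allow-list check applied once in the enclave and once on the host. All names (`resolve_in_enclave`, `check_egress`, `EgressDenied`) are hypothetical, matching is exact hostname comparison only, and how the proxy learns the intended destination is deliberately left out.

```rust
use std::net::{IpAddr, Ipv4Addr};

/// The in-enclave DNS answer: every hostname maps to loopback,
/// so the service's connections land on the local TCP proxy.
pub fn resolve_in_enclave(_hostname: &str) -> IpAddr {
    IpAddr::V4(Ipv4Addr::LOCALHOST)
}

/// Exact, case-insensitive hostname match. A trailing dot on a
/// fully qualified name is ignored.
fn is_allowed(hostname: &str, allow_list: &[&str]) -> bool {
    let host = hostname.trim_end_matches('.');
    allow_list.iter().any(|entry| entry.eq_ignore_ascii_case(host))
}

#[derive(Debug, PartialEq, Eq)]
pub enum EgressDenied {
    InEnclave,
    OnHost,
}

/// Applies the allow-list twice: first inside the enclave, then
/// again on the host before traffic leaves for the internet.
pub fn check_egress(
    hostname: &str,
    enclave_list: &[&str],
    host_list: &[&str],
) -> Result<(), EgressDenied> {
    if !is_allowed(hostname, enclave_list) {
        return Err(EgressDenied::InEnclave);
    }
    if !is_allowed(hostname, host_list) {
        return Err(EgressDenied::OnHost);
    }
    Ok(())
}
```

Checking in both places means a destination is reachable only when both layers agree, so a mistake or compromise in one layer does not open egress by itself.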

In some cases, your Cage won’t need any internet access. To prevent any unwanted egress, we release two binaries for every version of our Cages runtime: one which supports networking from the Cage, and another which uses Rust conditional compilation to filter out any networking-related code. We also disable the listeners on the host instance that proxy the traffic out to the internet, which removes a pretty convoluted footgun.
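For readers who haven't used it, Rust conditional compilation looks roughly like the sketch below. The feature name `network_egress` and the function names are assumptions made up for this example, not the identifiers used in the Cages runtime.

```rust
/// This module only exists in builds compiled with the (hypothetical)
/// `network_egress` feature. Without it, the code is never compiled.
#[cfg(feature = "network_egress")]
pub mod egress {
    pub fn start_proxies() -> &'static str {
        "egress proxies"
    }
}

/// Lists the components this build would start.
pub fn started_components() -> Vec<&'static str> {
    let mut components = Vec::new();
    components.push("data plane");

    // This statement is removed at compile time when the feature is off,
    // so the binary contains no call into the egress module at all.
    #[cfg(feature = "network_egress")]
    components.push(egress::start_proxies());

    components
}

/// Reports whether networking support was compiled into this binary.
pub fn egress_compiled_in() -> bool {
    cfg!(feature = "network_egress")
}
```

With Cargo, a feature like this would be declared under `[features]` in `Cargo.toml` and switched on with `cargo build --features network_egress`. Building without the flag produces a binary in which the networking code is simply absent, rather than present but disabled at runtime.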

Security + Usability = Accessible Enclaves

Our aim with Cages has always been to make the security benefits of TEEs more user-friendly. It shouldn’t be a major engineering effort to deploy a container to an Enclave with all of the benefits of a TEE available by default. Teams also shouldn’t have to choose between security and usability. Hopefully, you agree.

Cages is under very active development. We’re regularly updating the system to expose more control to the end developer to further the security of the product or to support new use-cases and features. We’ve recently added egress allow-listing, further scoped our internal authentication handshake, and added dashboards for resource consumption metrics. We’re also rolling out support for granular controls over Cage signing certificates, allowing team administrators to only allow their Cage to be deployed after being signed by a predefined certificate.

That being said, we’d love your feedback. If you want to play around with Cages, you can sign up for our 14-day free trial. Or, if you’re curious about how it all works, you can read our earlier posts or take a look at the source code.
