Evervault is on a mission to encrypt the web. We’re building encryption infrastructure for developers.
At the core of this infrastructure is E3, the Evervault Encryption Engine. E3 is a simple encryption service for performing cryptographic operations with low latency, high scalability, and extreme reliability.
Claude Shannon said to assume the enemy knows the system when designing a cryptosystem. At Evervault, we assume the enemy knows the system and don’t assume the developer does. We hold the corollary of Kerckhoffs’ principle—that a developer should only use cryptosystems about which everything is known—as a first principle, which requires us to share as much information about Evervault as possible.
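These are the four sections we’ll cover: why the web needs an encryption engine, how we designed E3, the implementation options we considered, and how we built E3.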
Diffie and Hellman proposed the theoretical construction of public-key cryptography in 1976—preceded by Merkle in 1974. In 1977, Rivest, Shamir, and Adleman provided its first practical implementation.
At the time, the extent to which their RSA cryptosystem would be used across the web was completely unfathomable; the web didn’t exist yet.
It wasn’t until 1991—14 years later—that Tim Berners-Lee shared his outline of the World Wide Web on alt.hypertext. For the following five years—until Elgamal and the team at Netscape published the SSL 3.0 specification in 1996—most data flowed across the internet in plaintext, leaving users vulnerable to anyone with some basic networking hardware and malicious intent.
Earlier efforts at general-purpose encryption (like kerberized TCP and PGP) had taken place, but it was a time when encryption was very nascent and experimental; nothing had won out yet.
By creating SSL (which later became TLS), Netscape successfully encrypted the web’s data channels; but they only encrypted data in transit. Once data reached its destination, it sat there in plaintext—biding its time before a rogue sysadmin, insecure password, or badly-configured network device opened the floodgates and caused a massive data breach.
Modern applications and systems are now a Rube Goldberg machine of TLS, encryption-at-rest, open-source encryption libraries, questionable key management, PEM files, and headache. Simply: encryption was never designed for the web in 2021, and data has never been treated as the ultimate endpoint in security.
The web needs an encryption engine that encrypts data at the field-level, and that all developers can build with.
As developers, we’re used to abstraction and simplicity across our workflow:
All this functionality is integrated through a few lines of code. The same is not true for encryption, the most important security tool developers have to protect their users and datasets. Encryption means that the only parties that can read/write plaintext, readable data are those with access to the keys. Put another way, encryption is the most important security tool because it means that no unauthorized party has access to user data or proprietary datasets, even when a breach occurs or data gets leaked.

We’re building the encryption engine for the web to change this. We’re abstracting away the complexities of encryption so that developers can build encrypted apps where data is encrypted at all times, and can still be processed. We’re building towards a web where all data is encrypted end-to-end, without sacrificing any ability to use the data in computation. Plaintext, readable data will never be exposed to an unauthorized party.

Core takeaway: The web needs an encryption engine so that data is treated as the ultimate endpoint in security, and so that developers can build encrypted apps. The web will be end-to-end encrypted.
Now that we know why the web needs an encryption engine, let’s consider how we designed E3. At Evervault, we work backwards to ensure consistency across our products and, importantly, across our team. First, we defined the core functionality of E3, then we set four clear design requirements for how it should work.

The core functionality was this: E3 is a simple encryption service for performing cryptographic operations with low latency, high scalability and extreme reliability. E3 is where all cryptographic operations for Evervault services, including Relay and Cages, happen. E3 lets Evervault never handle key material in environments that are not provably secure.

The four design requirements were these:

1. Data encrypted by E3 must be verifiably inaccessible by Evervault.
2. E3 must be use-case agnostic for present and future Evervault services.
3. E3 must be low-latency, high-throughput, and fully redundant.
4. Above all else, the integrity of keys must be absolutely guaranteed.
Let’s unpack each in turn.
Developers will use Evervault to work with extremely sensitive data and workloads. E3 must be designed with the assumption that developers are going to be processing everything from the mundane (email addresses, IP addresses, and browser versions) to the extreme (cardholder data, location data, and genomics information). It must be provable that customer data and keys are inaccessible to Evervault. That proof should be cryptographic, because cryptographic verification is easy to check and difficult or impossible to tamper with.
E3 will be the foundation of all future Evervault products and services. It needs to be extremely robust and use-case agnostic. E3 must have a simple, yet universal, set of RPC definitions that can be used across all languages we work in (Rust and Node.js at the time of writing). These must be well-documented and easily understood by everyone working across the Evervault stack. It’s well-observed that frequently changed software can become compoundingly unstable as new features are added and codebases grow. Old code is stable code. E3 should be built to last from the beginning.
Evervault will be in developers’ “critical path”—every additional microsecond of latency in our systems is an additional microsecond of latency for developers and their users. E3 must perform all cryptographic operations as fast and securely as possible, while minimizing known bottlenecks and optimizing for high throughput and full redundancy.
Above all else, Evervault can never lose keys. Once data is encrypted, it is unusable unless the original encryption keys are present to decrypt it. Losing keys means we would not be able to decrypt our customers’ data on request, so we knew that we needed to design against this and prevent it at all costs.

## Implementation Options

After defining our design requirements for E3, we considered implementation options for performing computation on Evervault-encrypted data. There were two potential options: fully homomorphic encryption (FHE) and trusted execution environments (TEEs).
The first plausible FHE scheme came in 2009 after Craig Gentry published “Fully homomorphic encryption using ideal lattices” which provided the constructions necessary to perform addition and multiplication operations on ciphertext, thereby allowing arbitrary computation on encrypted data.
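To see why addition and multiplication are enough for arbitrary computation, note that on single bits addition mod 2 is XOR, multiplication mod 2 is AND, and NAND (from which any Boolean circuit can be built) is 1 + a·b mod 2. The sketch below is illustrative only: it works on plaintext bits to show the algebra, whereas an FHE scheme would apply the same two operations to ciphertexts. All function names are made up for this example.

```python
def add(a: int, b: int) -> int:
    """Addition mod 2, which is XOR on single bits."""
    return (a + b) % 2


def mul(a: int, b: int) -> int:
    """Multiplication mod 2, which is AND on single bits."""
    return (a * b) % 2


def nand(a: int, b: int) -> int:
    """NAND built only from add and mul: 1 + a*b (mod 2)."""
    return add(1, mul(a, b))


def or_(a: int, b: int) -> int:
    """OR via De Morgan: NAND(NOT a, NOT b), where NOT x = NAND(x, x)."""
    return nand(nand(a, a), nand(b, b))


def full_adder(a: int, b: int, carry_in: int) -> tuple:
    """One-bit full adder composed purely from add and mul."""
    partial = add(a, b)
    total = add(partial, carry_in)
    carry_out = or_(mul(a, b), mul(partial, carry_in))
    return total, carry_out
```

Chaining `full_adder` bit by bit gives integer addition, and the same composition extends to any circuit, which is the sense in which the two operations are sufficient.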
For our use case, however, it became apparent that FHE was completely impractical given a number of limitations, chiefly that it is not yet fast enough for high-scale, general-purpose use cases.
TEEs have, for a long time, had a bad rap in the crypto community: incorrect configurations, loose I/O, and poorly architected systems all make running TEEs correctly very challenging.
We were confident that we could make TEEs work for our specific use case securely and safely, with the right platform choice and design decisions. The approaches to TEEs we considered were Intel SGX, AMD SEV, and AWS Nitro Enclaves.
Intel SGX is a set of extensions to the Intel instruction set that provides guarantees of integrity and confidentiality even when software is run on a machine where all privileged software is potentially malicious, including the kernel and further down the stack. SGX provides a remote attestation feature, where the chipset will generate a signature of the measurement of an enclave which can be attested via the Intel Attestation Service (IAS). This allows a third party to verify that the enclave is running a known binary, as well as a mechanism for establishing a secure channel with the enclave to exchange secret data. It’s a robust solution to the “rogue sysadmin” problem, but it has a number of constraints. Notably, the Enclave Page Cache (EPC) is capped at 128MB. In practice, ~35MB of this cache is reserved for enclave metadata. The remaining 93MB is all that enclave binaries can actively make use of (stack + heap).
Enclaves may only be hosted on EPC-allocated memory. The SGX Linux driver does allow enclaves to swap EPC memory for normal memory, but before doing so the driver must encrypt the page with the enclave sealing key (unique to each enclave binary on each specific CPU) to ensure confidentiality. The driver maintains an Enclave Page Cache Map (EPCM) in EPC memory which tracks each page. There is a significant performance cost to this process—potentially many orders of magnitude slower than simply restricting memory usage to the memory available to the EPC, depending on memory access patterns.
The tooling, however, is reasonably robust; there is a lively developer community, and there are a number of high-quality open-source SDKs for working with SGX. There are examples of Intel SGX being used at scale: Signal’s “secure value recovery” is built on Intel SGX.
Deployment, however, is one of the major issues in running SGX at scale. The only major cloud service provider with Intel SGX support is Microsoft Azure through Confidential Computing. Many cloud providers have chipsets based on the Skylake microarchitecture or later, but they have not enabled BIOS support for SGX.
SGX 2 promises to resolve a number of issues that have been uncovered with version 1, especially a number of the 7 (known) major attacks on Intel SGX. It will be rolled out with the Gemini Lake microarchitecture which is currently only available in the latest Intel NUC kits. We will revisit SGX at a later date once version 2.0 is widely available.
If Intel SGX is the scalpel of TEEs, AMD SEV is the sledgehammer. SEV allows virtual machines to have VM-specific encrypted memory, even with a rogue hypervisor. Major cloud service providers have begun to adopt this system, namely Google Cloud with their Confidential VM offering.
SEV provides us with many of the security guarantees we require, such as VM attestation—with the major drawback of VMs having unrestricted I/O. SEV leaves I/O in the hands of the application developer, which means security becomes entirely subject to the I/O restrictions of the VM running with SEV isolation, rather than the I/O restrictions of the application running in the VM. This creates a larger attack surface and would require more ongoing maintenance of kernel and operating system security than what we thought was reasonable.
Similarly, we would have been responsible for managing VMs and CI/CD using our own custom system (potentially Firecracker VM through Weave Ignite). This creates a larger attack surface, which could easily be avoided.
At re:Invent 2019, AWS announced a new product: Nitro Enclaves. Over the preceding years, AWS had begun a gradual transition away from software hypervisors towards their bespoke Nitro System. The premise of Nitro is to migrate the responsibility of resource isolation away from (vulnerable) software to a dedicated piece of hardware with firmware specifically designed for isolation.
Nitro Enclaves is based on a simple idea: a parent instance (a standard EC2 instance with Nitro Enclaves support) can isolate a portion of its memory and vCPUs to create a Nitro Enclave. The enclaves are isolated using the new AWS Nitro System, much like any other EC2 instance, and I/O is restricted to a single POSIX AF_VSOCK channel with the host instance. Enclave binaries are attested by the Nitro Secure Module, and there is built-in support for attestation verification in AWS KMS.
Having a dedicated enclave system integrated as part of a major cloud service provider like AWS is a major win: the service is available in almost all of the AWS regions, there are effectively no resource constraints, and we aren’t susceptible to issues requiring microcode updates (or similar).
Core takeaway: Intel, AMD, ARM, and other chip manufacturers, as well as cloud providers, are doubling down on security by providing Trusted Execution Environments which let developers run code securely in untrusted environments. Many of these TEEs provide useful cryptographic primitives like code attestation.
While Evervault — and, specifically, E3 — will become TEE-agnostic, we decided that AWS Nitro Enclaves offers the most robust platform to begin building on. This section will focus on how we built E3 and will cover how we do crypto, CI/CD, and observability & monitoring with AWS Nitro Enclaves.
As an early design partner for AWS Nitro Enclaves, we interacted with the product in its most larval stages as the team were still designing flows and writing documentation.

We had been heavy users of AWS Elastic Container Service (ECS) for our management APIs and internal systems before we built E3. It had served us very well up to this point, and we weren’t in need of many of the more advanced service discovery capabilities that services like Elastic Kubernetes Service (EKS) would have provided. Our infrastructure was reasonably simple, and we had been mostly working on a Node.js monolith up until that point.

Most of the recommended Nitro Enclaves flows revolved around EC2 and a custom Nitro Enclaves AMI that had been created by the AWS team. Although our previous ECS deployments had been on Fargate and served us well, we saw an opportunity to merge Nitro Enclaves with our existing infrastructure by deploying our ECS clusters on dedicated EC2 instances running the ECS-optimized AMI with a bootup script to install the Nitro Enclaves driver. This was not overly complex and simply required us to create a new EC2 Autoscaling Group with a startup script to run our custom Nitro Enclaves bootstrap.

We migrated our existing ECS services and tasks from Fargate to our new EC2 targets, which was pretty seamless thanks to the new ECS UI. Traffic was flowing to our new enclave-enabled instances, and everything was functioning as it should.

We had rolling, zero-touch deploys for our existing codebase; now we needed to figure out how to do the same for Nitro Enclaves. As early adopters, we didn’t have many reference points. We hope our approach will be helpful for developers exploring Nitro Enclaves.

We broke E3 into two constituent parts: E3 and the E3 Helper.
The E3 binary itself is what runs the crypto, manages state and handles RPC requests. It runs inside of a Nitro Enclave. The whole E3 codebase is less than 2000 lines of Rust, aside from some changes that we made to Steven Fackler’s rust-openssl FFI crate to align it with the original OpenSSL C implementation. To run a Nitro Enclave, AWS created a Docker-esque bundler which generates a static Enclave Image File (EIF) which is used as the image for the enclave. The E3 binary is compiled for x86_64-unknown-linux-musl with statically linked OpenSSL (quick startup is important to us, and we wanted to avoid re-compiling on the host to include dynamically-linked OpenSSL).
The E3 Helper binary spawns and destroys enclaves (through the Nitro Enclave syscalls), streams logs, fetches state, handles notifications from our core API and communicates with external AWS services. It is E3’s gateway to the outside world.
We dissected the Nitro Enclaves CLI and, alongside scanning through some changes to the mainline Linux kernel by the Nitro Enclaves team, we assembled a new enclave creation flow in our Rust codebase. It directly interfaces with the Nitro Enclaves driver on our new EC2 machines and runs syscalls to create enclaves, fetch metadata and terminate enclaves. We created a Docker image containing the E3 Helper binary and the E3 EIF, as well as a startup script that terminates existing enclaves and spawns a new one based on the bundled EIF.

There is currently a limit of one enclave per host instance, although we expect that allowing multiple enclaves per host will be a possibility in the near future. For Evervault, this makes it difficult for us to run multiple applications within an enclave on a single parent instance, and robust separation of concerns becomes increasingly important as a result.

The only means of communication between E3 and the E3 Helper is the AF_VSOCK channel, with additional authentication added for any requests that alter state relating to key material. Similarly, application containers interface directly with E3 over the same AF_VSOCK channel, but with a more restricted set of permitted RPC calls.

The biggest advantage of using AF_VSOCK for enclave communication is that it is safely isolated from the network and only kernel-to-kernel sockets are allowed. Functionally, AF_VSOCK is very similar to a Unix IPC socket. It is designed for VM ↔ host communication and is used extensively in projects like QEMU. Even in the event of a malicious attacker managing to gain access to our VPC, there is no I/O in the IP address space, so any attacks are useless unless attackers gain access to the parent instance. Even then, the only access a potential attacker would have is to the socket-based I/O that is designed and implemented by the enclave developer.
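As a rough illustration of how a client on the parent instance might talk to an enclave over AF_VSOCK, here is a minimal Python sketch (E3 itself is written in Rust). The CID, port, and the 4-byte length-prefixed framing are assumptions made for this example, not details of E3’s protocol. `socket.AF_VSOCK` is only available on Linux builds of Python, so the connect helper checks for it at runtime.

```python
import socket

# Hypothetical values for illustration; real CIDs and ports are deployment-specific.
ENCLAVE_CID = 16
ENCLAVE_PORT = 5000


def frame(payload: bytes) -> bytes:
    """Prefix a payload with its length as a 4-byte big-endian integer."""
    return len(payload).to_bytes(4, "big") + payload


def recv_exact(sock: socket.socket, size: int) -> bytes:
    """Read exactly `size` bytes from a stream socket, or raise if the peer closes."""
    chunks = []
    remaining = size
    while remaining:
        chunk = sock.recv(remaining)
        if not chunk:
            raise ConnectionError("peer closed the connection early")
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


def exchange(sock: socket.socket, request: bytes) -> bytes:
    """Send one framed request and read back one framed response."""
    sock.sendall(frame(request))
    length = int.from_bytes(recv_exact(sock, 4), "big")
    return recv_exact(sock, length)


def connect_vsock(cid: int = ENCLAVE_CID, port: int = ENCLAVE_PORT) -> socket.socket:
    """Open a stream connection to an enclave; needs Linux with vsock support."""
    family = getattr(socket, "AF_VSOCK", None)
    if family is None:
        raise OSError("AF_VSOCK is not available on this platform")
    sock = socket.socket(family, socket.SOCK_STREAM)
    try:
        sock.connect((cid, port))
    except OSError:
        sock.close()
        raise
    return sock
```

Because the address is a (CID, port) pair rather than an IP address, nothing here is reachable from the VPC. `exchange` works on any connected stream socket, so it can be exercised locally with `socket.socketpair()`.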
We created a global RPC framework that contained all of the necessary requests, responses, and error types that we envisaged for future products, as well as a set of serialization methods to prepare the RPC structs for transfer over the socket. Recent trends in serialization pointed us towards systems like Cap’n Proto and Protocol Buffers which we admired for their performance and declarative typing. Many of our requests, however, are loosely typed as they can contain arbitrary data from our users that are provided as JSON, XML and other formats. Network throughput was not a major point of concern as there was no “network” for the data to be streamed through — it was all intra-host data transfer. MessagePack became an obvious choice for us, as it is a simple and efficient way to serialize loosely-structured JSON payloads as a raw byte stream and it is well supported by frameworks like Serde.
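To make the appeal of MessagePack for loosely typed payloads concrete, the following Python sketch hand-encodes a small subset of the format (nil, booleans, unsigned integers up to 32 bits, short strings and byte strings, and small arrays and maps). It is a teaching aid written for this post, not E3’s serializer; a real system would use a maintained MessagePack library.

```python
def pack(obj) -> bytes:
    """Encode a small subset of Python values as MessagePack bytes."""
    if obj is None:
        return b"\xc0"                                   # nil
    if isinstance(obj, bool):                            # must precede the int check
        return b"\xc3" if obj else b"\xc2"               # true / false
    if isinstance(obj, int):
        if 0 <= obj <= 0x7F:
            return bytes([obj])                          # positive fixint
        if 0 <= obj <= 0xFF:
            return b"\xcc" + obj.to_bytes(1, "big")      # uint 8
        if 0 <= obj <= 0xFFFF:
            return b"\xcd" + obj.to_bytes(2, "big")      # uint 16
        if 0 <= obj <= 0xFFFFFFFF:
            return b"\xce" + obj.to_bytes(4, "big")      # uint 32
        raise ValueError("integer outside the range this sketch supports")
    if isinstance(obj, str):
        data = obj.encode("utf-8")
        if len(data) <= 31:
            return bytes([0xA0 | len(data)]) + data      # fixstr
        if len(data) <= 0xFF:
            return b"\xd9" + bytes([len(data)]) + data   # str 8
        raise ValueError("string too long for this sketch")
    if isinstance(obj, bytes):
        if len(obj) <= 0xFF:
            return b"\xc4" + bytes([len(obj)]) + obj     # bin 8
        raise ValueError("byte string too long for this sketch")
    if isinstance(obj, list):
        if len(obj) <= 15:
            return bytes([0x90 | len(obj)]) + b"".join(pack(x) for x in obj)  # fixarray
        raise ValueError("array too long for this sketch")
    if isinstance(obj, dict):
        if len(obj) <= 15:
            body = b"".join(pack(k) + pack(v) for k, v in obj.items())
            return bytes([0x80 | len(obj)]) + body       # fixmap
        raise ValueError("map too long for this sketch")
    raise TypeError(f"unsupported type: {type(obj).__name__}")
```

For example, `pack({"compact": True, "schema": 0})` yields 18 bytes, against 27 bytes for the equivalent minified JSON, and no schema has to be declared up front.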
Docker is known for its isolation characteristics, specifically relating to networking. One of the potential security implications of the Nitro Enclaves model, however, is that AF_VSOCK sockets are completely accessible even within Docker containers on the same machine, and there is no intuitive way of restricting this aside from having authentication at Layer 4 or higher on client ↔ enclave connections. This may lead to potential security concerns for some deployments. We implemented RPC signing to prevent this. All finalized RPC bytestreams are signed using ECDSA and are verified within the E3 enclave.

Core takeaway: E3 runs inside an AWS Nitro Enclave which is accessed only by Evervault services over a local channel and is not exposed to the internet.
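The sign-then-verify shape of RPC signing can be sketched with Python’s standard library. One loud caveat: the standard library has no ECDSA, so this example swaps in HMAC-SHA256, a symmetric MAC. With ECDSA the enclave would only need the caller’s public key, whereas with HMAC both sides share a secret. The function names and the "tag appended to the payload" layout are assumptions for illustration.

```python
import hashlib
import hmac

TAG_LEN = hashlib.sha256().digest_size  # 32 bytes


def sign_rpc(key: bytes, rpc_bytes: bytes) -> bytes:
    """Append an HMAC-SHA256 tag (a stand-in for an ECDSA signature) to the RPC bytes."""
    tag = hmac.new(key, rpc_bytes, hashlib.sha256).digest()
    return rpc_bytes + tag


def verify_rpc(key: bytes, signed: bytes) -> bytes:
    """Return the RPC bytes if the trailing tag is valid, otherwise raise ValueError."""
    if len(signed) < TAG_LEN:
        raise ValueError("message too short to carry a tag")
    body, tag = signed[:-TAG_LEN], signed[-TAG_LEN:]
    expected = hmac.new(key, body, hashlib.sha256).digest()
    # Constant-time comparison avoids leaking how many tag bytes matched.
    if not hmac.compare_digest(tag, expected):
        raise ValueError("RPC authentication failed")
    return body
```

The important property is that verification happens inside the enclave before the request is parsed or acted on, so a container that can merely reach the socket cannot issue valid requests.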
AWS Nitro Enclaves provides close bindings with AWS Key Management Service, through strict IAM conditions and a KMS parameter that allows KMS responses to be enveloped with the public key of an enclave, attested by the Nitro Secure Module (NSM). All binaries running within a Nitro Enclave have full access to the NSM GetAttestationDoc function through a hypercall which generates a document containing the enclave’s Platform Configuration Registers (PCRs). PCRs are SHA-384 hashes of an enclave’s characteristics, including a checksum of the binary itself, a hash of the binary signing key, parent instance ID, and parent instance IAM role. The document can also carry an optional user-provided public key, custom user data, and a custom nonce.

The enclave application generates a CBOR-encoded GetAttestationDocument request and uses the NSM driver to request a signed AttestationDocument from the Nitro Hypervisor. The response is a COSE-signed CBOR-encoded AttestationDocument which is signed using one of Amazon’s Private Root CAs. By providing an application-generated RSA public key (RSA-2048, RSA-3072 or RSA-4096) in the AttestationDocument request, this can be used directly with KMS to allow Decrypt responses to be re-encrypted for the enclave as a target, assuming the AttestationDocument matches the IAM conditions in the key’s policy.
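The policy check at the heart of attestation is a comparison of measured PCR values against expected ones. The sketch below assumes the attestation document has already been COSE-verified and CBOR-decoded into a mapping with a `pcrs` entry keyed by PCR index; that decoded shape and the function name are assumptions for illustration, and signature and certificate-chain validation are deliberately out of scope.

```python
import hashlib
import hmac
from typing import Mapping

PCR_LEN = hashlib.sha384().digest_size  # SHA-384 digests are 48 bytes


def pcrs_match(document: Mapping, expected: Mapping[int, bytes]) -> bool:
    """Return True only if every expected PCR is present and identical."""
    pcrs = document.get("pcrs", {})
    ok = bool(expected)  # an empty policy should not approve anything
    for index, want in expected.items():
        got = pcrs.get(index, b"")
        # compare_digest runs in constant time for equal-length inputs.
        if len(want) != PCR_LEN or not hmac.compare_digest(got, want):
            ok = False
    return ok
```

A verifier would pin the expected values at build time (for example, the measurement emitted when the enclave image is built) and refuse to release key material to any enclave whose document does not match.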
In addition to the CiphertextBlob parameter, KMS TrentService accepts an additional Recipient parameter which is a Base64-encoded representation of the signed AttestationDocument bytes. KMS will decrypt the CiphertextBlob and return a response containing a CiphertextForRecipient field which is a PKCS#7 CMS enveloped response. Support for CMS is currently reasonably weak; OpenSSL is one of the only widely-used crypto libraries with support for CMS decryption.
We had to submit some changes to the rust-openssl FFI crate to allow for a NULL recipient certificate (as the AttestationDocument request contains only a public key, not an X.509 certificate). It is worth noting that this change could potentially leave implementations vulnerable to Bleichenbacher’s attack on PKCS#1 v1.5 RSA padding. In our implementation, these concerns are not an issue as we have no access to any of the decrypt responses or stack traces.
Core takeaway: AWS Nitro Enclaves provides cryptographic primitives that let you encrypt information that can only be decrypted by a specific AWS Nitro Enclave image.

### CI/CD with AWS Nitro Enclaves

All of our Nitro Enclave source code is written in Rust. The reasons for this were:

- Early reference SDKs from the Nitro Enclaves team, and
- The obvious benefits of memory safety, performance, and low-level access to Linux ioctl() calls, which are necessary for a number of Nitro/NSM features.
We use GitHub for all of our source control, as well as self-hosted GitHub Actions for deployment flows. Every time a change is made, we compile both the E3 Helper and E3 for the x86_64-unknown-linux-musl target.
We separately have a Dockerfile which bundles the enclave binary and Alpine Linux in order to generate an Enclave Image File (EIF). Once the EIF has been generated, our self-hosted Action signs the binary with our signing key. A second Docker image is then created which contains the Nitro Enclaves CLI, the E3 Helper, and the EIF file. This contains a bootstrap script that terminates existing enclaves running on the EC2 instances, creates a new enclave from the bundled EIF and starts the E3 Helper.
The E3 Helper fetches all necessary encrypted customer data and keys from our core API and passes it to E3 over AF_VSOCK. Once the data is passed into E3, the E3 Helper and E3 generate a checksum to verify that the data was transferred as intended. The E3 Helper then sends a SetReady message to the enclave to confirm it is ready to accept requests and to begin passing the ECS health check.
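The load, checksum, then SetReady sequence can be sketched as follows. This is a simplified model, not E3’s code: `StubEnclave` stands in for the enclave side of the AF_VSOCK channel, SHA-256 is an assumed choice of checksum, and each blob is length-prefixed before hashing so that different splits of the same bytes produce different digests.

```python
import hashlib
import hmac


def transfer_checksum(blobs) -> bytes:
    """Digest over a sequence of blobs, length-prefixing each to keep boundaries."""
    digest = hashlib.sha256()
    for blob in blobs:
        digest.update(len(blob).to_bytes(8, "big"))
        digest.update(blob)
    return digest.digest()


class StubEnclave:
    """Hypothetical stand-in for the enclave end of the channel."""

    def __init__(self):
        self._blobs = []
        self.ready = False

    def load(self, blob: bytes) -> None:
        self._blobs.append(blob)

    def checksum(self) -> bytes:
        return transfer_checksum(self._blobs)

    def set_ready(self) -> None:
        self.ready = True


def bootstrap(enclave: StubEnclave, blobs) -> bool:
    """Load encrypted blobs, confirm both sides agree, then mark the enclave ready."""
    for blob in blobs:
        enclave.load(blob)
    if not hmac.compare_digest(enclave.checksum(), transfer_checksum(blobs)):
        return False  # leave the enclave unready so the health check keeps failing
    enclave.set_ready()
    return True
```

Gating readiness on the checksum means an enclave that received incomplete state never starts serving traffic.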
Core takeaway: E3 is written in Rust, and we use GitHub Actions to sign and deploy the E3 binary to Amazon ECS.

### Observability and Monitoring of AWS Nitro Enclaves

By default, AWS Nitro Enclaves have extremely restricted I/O to prevent data leaks. We didn’t want to interfere with this security model much, if at all. Nitro Enclaves can be run in debug mode (not production-safe), which launches a second AF_VSOCK server on CID 0, port (10000 + CID) and provides a raw byte stream of system logs (stdout) from both the kernel and application within the AWS Nitro Enclave. For our security model, these logs are too verbose and potentially leak information about the data being processed in the error stack of a crypto exception, for example.
To give us insight into what is going wrong with specific requests without leaking data from the enclave, we designed a small number of custom error types that are verbose enough for us to understand the type of error and how it can be fixed, but anonymous enough that they reveal very little, or nothing, about the data itself. The error types were created as a Rust enum and serialized using MessagePack as part of the RPC response.
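The idea of coarse, data-free error types can be illustrated with a Python enum (E3’s are a Rust enum). The variant names, the exception-to-variant mapping, and the response layout below are all hypothetical; the point is that the response carries only a category, never the exception message or a stack trace.

```python
from enum import Enum


class CryptoError(Exception):
    """Hypothetical exception raised by a cryptographic operation."""


class E3Error(Enum):
    """Hypothetical coarse error categories that carry no request data."""
    INVALID_REQUEST = 1
    UNAUTHORIZED = 2
    KEY_NOT_FOUND = 3
    CRYPTO_FAILURE = 4
    INTERNAL = 5


def classify(exc: Exception) -> E3Error:
    """Map an internal exception to a category, discarding its message."""
    if isinstance(exc, PermissionError):
        return E3Error.UNAUTHORIZED
    if isinstance(exc, KeyError):
        return E3Error.KEY_NOT_FOUND
    if isinstance(exc, CryptoError):
        return E3Error.CRYPTO_FAILURE
    if isinstance(exc, (TypeError, ValueError)):
        return E3Error.INVALID_REQUEST
    return E3Error.INTERNAL


def error_response(exc: Exception) -> dict:
    """Build the RPC error payload; only the category leaves the enclave."""
    category = classify(exc)
    return {"ok": False, "error": category.name, "code": category.value}
```

Even if an exception message embeds sensitive input, `error_response` never copies it into the payload, so the caller learns what kind of failure occurred and nothing about the data.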
Separately, we added a GetStatistics method to our RPC which runs every few seconds and returns important performance data like memory usage, CPU utilization, and open TCP sockets. These metrics are then shipped to AWS CloudWatch where they are input metrics for our autoscaling configuration.
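A rough sketch of what a statistics snapshot and its conversion to metrics might look like is shown below. The field names, metric names, and the `EnclaveId` dimension are hypothetical; the dictionary layout mirrors the MetricName, Value, Unit, and Dimensions fields of a CloudWatch PutMetricData entry, but no AWS call is made here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Statistics:
    """Hypothetical snapshot returned by a GetStatistics call."""
    memory_used_bytes: int
    cpu_utilization_percent: float
    open_tcp_sockets: int


def to_metric_data(stats: Statistics, enclave_id: str) -> list:
    """Convert a snapshot into CloudWatch-style metric entries."""
    dimensions = [{"Name": "EnclaveId", "Value": enclave_id}]

    def entry(name: str, value: float, unit: str) -> dict:
        return {
            "MetricName": name,
            "Value": float(value),
            "Unit": unit,
            "Dimensions": dimensions,
        }

    return [
        entry("MemoryUsed", stats.memory_used_bytes, "Bytes"),
        entry("CPUUtilization", stats.cpu_utilization_percent, "Percent"),
        entry("OpenTcpSockets", stats.open_tcp_sockets, "Count"),
    ]
```

Only aggregate counters cross the enclave boundary in this model, which keeps the metrics path consistent with the restricted error types described earlier.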
Core takeaway: Observability and monitoring with AWS Nitro Enclaves is still relatively uncharted territory, so we manually fetch and ship performance metrics from E3 to CloudWatch.

## Summary

This post showed why and how we’re building E3, the web’s encryption engine. These are the core takeaways:

- The Need for Encryption: The web needs an encryption engine so that data is treated as the ultimate endpoint in security, and so that developers can build encrypted apps, which is why we’re building E3. The web will be end-to-end encrypted.
- E3 Is Where All Cryptographic Operations for Evervault Services Happen: E3 had four design requirements: data encrypted by E3 must be verifiably inaccessible by Evervault; E3 must be use-case agnostic for present and future Evervault services; E3 must be low-latency, high-throughput, and fully redundant; and, above all else, the integrity of keys must be absolutely guaranteed.
- Implementation Options: We considered two implementation options for performing computation on Evervault-encrypted data: fully homomorphic encryption (FHE) and trusted execution environments (TEEs). Unfortunately, while there have been great advancements in FHE in recent years, it’s not yet fast enough for high-scale, general-purpose use cases. TEEs let developers run code securely in untrusted environments, and provide useful cryptographic primitives like code attestation. AWS Nitro Enclaves is the most advanced and robust TEE, so we decided to build E3 with AWS Nitro Enclaves.
- E3 Is Written in Rust and Runs Inside an AWS Nitro Enclave: This is accessed only by Evervault services over a local channel and is not exposed to the network. AWS Nitro Enclaves provides cryptographic primitives that let Evervault encrypt information that can only be decrypted by a specific AWS Nitro Enclave image. Observability and monitoring with AWS Nitro Enclaves is still relatively uncharted territory. We designed a small number of custom error types to enable observability and monitoring without leaking data from the enclave.
Given the security requirements we had in mind when we set about building E3, we assumed that building a production-grade system to match them would be a major struggle.
E3 has reached a point where we are confident in sharing how it’s built, and we are delighted to be building Relay and Cages on top of it — all thanks to new advancements in TEEs, a bustling cryptography community in Rust, and robust infrastructure from AWS.
At Evervault, we believe that it’s early for the web and that its future is encrypted. E3 will power it.
Thank you to Jeff Weiner, Tom Killalea, Colm MacCárthaigh, Dylan Field, Erin Price-Wright, Bucky Moore, and William Yap for reading drafts of this, and to the AWS Nitro Enclaves team for their customer-obsessed approach to building Nitro Enclaves and working with us during the closed preview.
Founder, CEO