CertiK applies a range of analytical methods during audits. This includes careful code review by human auditors, bug-finding through fuzzing, static analysis against insecure code patterns, model checking with respect to standard protocols, and more. One of the most interesting methods is to use machine-checked proofs (a.k.a. formal verification). Formal verification can include automatic verification of standard properties, e.g. using model checking, but we also undertake more advanced verification where an engineer specifies custom-made security properties for a particular project and creates proofs of them.
As shown in the figure below, a Web3 application relies on a stack of layers. The business logic of the contract is written in a high-level programming language, which is compiled into bytecode and interpreted by a bytecode interpreter on the blockchain node. The node also runs chain-specific code, such as staking logic and a consensus algorithm. Finally, the blockchain node runs on physical cloud infrastructure, which involves system software like operating systems, virtual machine hypervisors, trusted execution environments, etc.
When discussing Web3 security, the focus is often on the smart contracts at the top of the stack. The layers deeper down are shared between many different Web3 applications, so they have undergone more testing and are less likely to have bugs. But at the same time, that also makes them important: an error in a blockchain node implementation can endanger all applications on that blockchain, and an error in the system infrastructure could endanger the entire Web3 world.
CertiK's work on machine-checked proofs touches on every layer of the Web3 software stack. For example:
In a recently released report, we described how we applied formal verification to HyperEnclave, a trusted execution environment developed by Ant Group’s Trust Native Technology team.
A Trusted Execution Environment (TEE) is designed to protect applications from one of the most challenging types of attacks: those which gain control even over the operating system of the computer. Some of the best-known TEEs are Intel SGX, Arm TrustZone, and AMD SEV. They work by letting the CPU cryptographically attest that a particular secure application (an enclave) was loaded, and then preventing other software from interfering with it.
TEEs have become integrated into many facets of digital life. Disk encryption software like Apple FileVault and two-factor authentication apps like Google Authenticator store the keys in a TEE to keep them safe even if someone steals and disassembles the laptop or phone. And TEEs are also becoming increasingly important in Web3.
Cryptocurrency wallets use them to store crypto keys more securely. Distributed blockchain oracles can use TEEs to increase confidence that data is authentic. Vitalik Buterin proposed using TEEs as a component of a "2-of-3 system" to recover from ZK-proof implementation bugs. And several blockchains, such as LucidiTEE, SubstraTEE, Oasis Network, and AntChain propose using TEEs to provide data privacy for users.
Currently, most TEEs are closed source and tied to particular hardware vendors. For example, in order to use Intel SGX, the application must run on an Intel CPU and be approved and whitelisted by Intel. And because hardware is inherently slow to develop, hardware-based TEEs have a smaller feature set, and fixing security problems requires CPU firmware updates.
By contrast, HyperEnclave is implemented mostly in software, using two widely available hardware features. First, the computer's trusted platform module (which is ordinarily used to implement UEFI Secure Boot) is used to attest that HyperEnclave and a particular set of enclaves are running. Second, HyperEnclave protects the enclaves using the CPU's virtualization extensions (which are ordinarily used by VM hypervisors like VMware, VirtualBox, Hyper-V, KVM, etc). To the computer hardware, HyperEnclave looks like any other VM hypervisor, but to the enclave it provides an SGX-compatible API so that applications written for SGX can easily be adapted to run on HyperEnclave instead. By using standard virtualization technology, it can easily support high-performance applications that interact closely with the operating system.
The most security-critical part of HyperEnclave is called RustMonitor. It's the hypervisor that protects the enclaves from each other and from the operating system. The enclaves and OS execute in virtualization "guest" mode, meaning that each time they try to access memory, the memory address is first translated using a page table managed by RustMonitor. By placing the enclaves in separate regions of physical memory, RustMonitor can ensure that they can never read or overwrite data owned by another enclave. But it is not a simple static separation; some memory must be allowed to overlap so that enclaves can communicate with the operating system, and the mappings are updated to handle page faults.
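The separation argument can be sketched in a few lines of Rust. This is a simplified model with hypothetical names and types, not RustMonitor's actual code: the monitor tracks which guest owns each physical page, and only installs a mapping when the target page is owned by the requesting guest or explicitly marked as shared.

```rust
// Simplified sketch of the separation property a monitor enforces.
// All names and types here are illustrative, not from HyperEnclave.

use std::collections::HashMap;

type Gpa = u64; // guest-physical address (page-aligned)
type Hpa = u64; // host-physical address (page-aligned)

#[derive(Clone, Copy, PartialEq, Eq, Hash)]
enum Owner {
    Os,
    Enclave(u32),
    Shared, // buffer explicitly shared for OS <-> enclave communication
}

struct Monitor {
    // who owns each host-physical page
    owner: HashMap<Hpa, Owner>,
    // per-guest page tables, maintained by the monitor
    tables: HashMap<Owner, HashMap<Gpa, Hpa>>,
}

impl Monitor {
    /// Install a mapping only if it cannot break isolation:
    /// a guest may map pages it owns, or pages marked Shared.
    fn map(&mut self, guest: Owner, gpa: Gpa, hpa: Hpa) -> Result<(), &'static str> {
        match self.owner.get(&hpa) {
            Some(o) if *o == guest || *o == Owner::Shared => {
                self.tables.entry(guest).or_default().insert(gpa, hpa);
                Ok(())
            }
            _ => Err("mapping would cross an isolation boundary"),
        }
    }
}

fn main() {
    let mut m = Monitor { owner: HashMap::new(), tables: HashMap::new() };
    m.owner.insert(0x1000, Owner::Enclave(1));
    m.owner.insert(0x2000, Owner::Os);
    m.owner.insert(0x3000, Owner::Shared);

    // an enclave may map its own page and the shared page...
    assert!(m.map(Owner::Enclave(1), 0x10_0000, 0x1000).is_ok());
    assert!(m.map(Owner::Enclave(1), 0x20_0000, 0x3000).is_ok());
    // ...but not a page owned by the OS
    assert!(m.map(Owner::Enclave(1), 0x30_0000, 0x2000).is_err());
}
```

The real system is far more dynamic (mappings change on page faults), which is exactly why the property needs to be proved rather than eyeballed.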
There is a list of "page table invariants" (written by the original HyperEnclave developers) that specify how the mapping tables are intended to work. For example, the invariants include properties like "a virtual address is mapped to a physical page in the EPC if and only if the virtual address is in the ELRANGE". See the report for a full list. A mistake here could invalidate all the HyperEnclave security guarantees.
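The quoted invariant can be read as an executable predicate. The following Rust sketch (with hypothetical types; the real invariants are stated in Coq over the actual page table state) shows the "if and only if" structure: every mapping must either land in the EPC with its virtual address inside the ELRANGE, or do neither.

```rust
// Executable reading of one example invariant, with illustrative types:
// "a virtual address is mapped to a physical page in the EPC
//  if and only if the virtual address is in the ELRANGE".

use std::ops::Range;

type Va = u64;
type Pa = u64;

struct Config {
    elrange: Range<Va>, // enclave linear address range
    epc: Range<Pa>,     // enclave page cache (protected physical memory)
}

/// Check the iff-invariant over a concrete page table, represented
/// here simply as a list of (virtual, physical) mappings.
fn invariant_holds(cfg: &Config, mappings: &[(Va, Pa)]) -> bool {
    mappings
        .iter()
        .all(|&(va, pa)| cfg.epc.contains(&pa) == cfg.elrange.contains(&va))
}

fn main() {
    let cfg = Config { elrange: 0x1000..0x2000, epc: 0x8000..0x9000 };

    // in-ELRANGE virtual page backed by EPC memory: fine
    assert!(invariant_holds(&cfg, &[(0x1800, 0x8800)]));
    // in-ELRANGE virtual page backed by non-EPC memory: violation
    assert!(!invariant_holds(&cfg, &[(0x1800, 0x4000)]));
    // out-of-ELRANGE page backed by EPC memory: also a violation
    assert!(!invariant_holds(&cfg, &[(0x5000, 0x8800)]));
}
```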
The design of RustMonitor takes several steps to make this code trustworthy. It is kept small — about 3,000 lines of platform-independent code, plus a similar amount of x86-specific libraries. And it is written in Rust, a memory-safe language, in order to rule out large classes of bugs. But given its importance, one would still want an even higher level of assurance, and so we turned to formal verification.
The RustMonitor page table management code is a good target because it is small but security critical, so, similar to Web3 smart contracts, it is feasible to spend the effort to formally verify it. In this project, CertiK started from the page table invariants and translated them from English into the Coq formal specification language. We then produced a machine-checked proof that whenever the RustMonitor code modifies the tables, the resulting tables still satisfy all the invariant properties. In fact our development went further, because we also wanted to make sure that the invariants were correctly designed and that we translated them correctly into Coq. To do that, we further specify (in Coq) what information the guests are supposed to be able to observe, and use the invariants to prove that no other flows of information are present (a "noninterference theorem"). Together, these proofs give strong assurance that the page table handling was both designed and implemented correctly.
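The shape of the invariant-preservation obligation can be illustrated as an executable state machine in Rust (a didactic sketch with a toy invariant, not the Coq development itself): for every state-modifying operation, if the invariant holds before the operation, it must hold after.

```rust
// Toy model of invariant preservation: each transition must
// re-establish the invariant. Names and the invariant itself are
// illustrative, not taken from the HyperEnclave proofs.

#[derive(Clone)]
struct State {
    mapped: Vec<(u64, u64)>, // (virtual page, physical page)
}

/// Toy invariant: no two virtual pages share a physical page.
fn invariant(s: &State) -> bool {
    let mut phys: Vec<u64> = s.mapped.iter().map(|&(_, pa)| pa).collect();
    phys.sort_unstable();
    phys.windows(2).all(|w| w[0] != w[1])
}

/// A transition that refuses to break the invariant, mirroring how
/// verified code must re-establish its invariants on every update.
fn add_mapping(s: &State, va: u64, pa: u64) -> State {
    let mut next = s.clone();
    if !next.mapped.iter().any(|&(_, p)| p == pa) {
        next.mapped.push((va, pa));
    }
    next
}

fn main() {
    let mut s = State { mapped: vec![] };
    assert!(invariant(&s));
    for (va, pa) in [(1, 100), (2, 200), (3, 100)] {
        s = add_mapping(&s, va, pa);
        // preservation: the invariant holds after every single step
        assert!(invariant(&s));
    }
    assert_eq!(s.mapped.len(), 2); // (3, 100) was refused
}
```

In the Coq development this quantification over all reachable states is done once and for all by a proof, rather than checked dynamically as here.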
However, proving the correctness of a TEE monitor is not an easy task. System-level programs like hypervisors and OS kernels use low-level code that involves data structures with pointers, bypasses type abstractions, and optimizes for performance. Verifying this kind of program is still an active research problem worthy of papers in top computer science venues, and proving them down to every line of code requires heroic effort. For example, seL4 reported 20 person-years for their initial proof, and Komodo spent two person-years to verify assembly code corresponding to about 650 lines of C. The code in those projects was written from scratch with the goal of easing verification, unlike HyperEnclave, which is already in production and is written in idiomatic Rust. Clearly, given the state of the art, a whole-code verification approach is not economical. Another problem is that the verification tooling for Rust is not as mature as that for other programming languages. It's possible to reduce the verification burden by working on a hand-written "model" of the code instead of on the actual code. One example is the Sanctum TEE, another verification research project. The downside of this approach is that mismatches between the code and the abstract model may weaken or invalidate any security or correctness properties proved about the model.
CertiK is well positioned to tackle this problem. Our two founders, Prof. Zhong Shao and Prof. Ronghui Gu, are the inventors of the Concurrent Certified Abstraction Layers (CCAL) approach to system code verification, which they used to verify first CertiKOS (the world's first verified concurrent OS kernel) and then seKVM (a verified version of the KVM hypervisor) and the Arm Confidential Compute Architecture. The basic idea in CCAL is to divide all the functions in the program into a set of "layers", write abstract models (similar to Sanctum) of each layer, and then, as the critical step, prove that the real code implements the abstract model, which significantly reduces the risk of a mismatch in behavior. The layered approach makes it easier to handle code proofs and concurrency issues compared to trying to verify the entire program as a monolith, and the work on seKVM and Arm CCA has received attention for handling industrial system software — projects which previously would have been too large for formal verification.
The HyperEnclave verification team included personnel who had previously worked on CertiKOS and seKVM, and we leveraged that experience to strike a good trade-off between security guarantees and resources spent on verification. We use CCAL-style specifications for a "tunable" verification approach: while our framework lets one verify that the function code conforms to the model, we elect to do so only for a subset of functions (the 49 functions that directly deal with how the page tables are represented in memory). For the others, we trust that the handwritten translation from Rust to Coq was done correctly, and prove that the Coq model is functionally correct. The choice of where to spend additional resources is guided by how low-level the data representation in the original program is.
Where HyperEnclave used high-level Rust data structures, we directly translated them into Coq; but where it used a low-level representation of page tables in terms of bytes in memory, we spent more effort proving that this representation is equivalent to a high-level functional one. In terms of the structure of RustMonitor, this means that we applied code proofs to the “Memory” module that handles page table representation, used abstract models for the “Enclave” module that manages enclave state, and then combined the resulting models to prove the invariant theorems and top-level security theorem.
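What a data-representation proof obligation looks like can be sketched in Rust. The field layout below is illustrative, not the real x86 page table entry format: the point is that the low-level bit-packed view and the abstract functional view must be provably interchangeable.

```rust
// Sketch of a representation-equivalence obligation: the low-level
// byte/bit encoding of a page table entry must correspond exactly to
// the abstract functional view. The field layout is illustrative,
// not the actual x86 PTE format.

#[derive(Debug, PartialEq, Clone, Copy)]
struct AbstractEntry {
    present: bool,
    writable: bool,
    frame: u64, // physical frame number
}

/// Low-level view: pack the entry into a 64-bit word.
fn encode(e: AbstractEntry) -> u64 {
    (e.present as u64) | ((e.writable as u64) << 1) | (e.frame << 12)
}

/// Recover the abstract view from the bits.
fn decode(bits: u64) -> AbstractEntry {
    AbstractEntry {
        present: bits & 1 != 0,
        writable: bits & 2 != 0,
        frame: bits >> 12,
    }
}

fn main() {
    // the representation relation holds iff decode inverts encode
    let e = AbstractEntry { present: true, writable: false, frame: 0x1a2b3 };
    assert_eq!(decode(encode(e)), e);

    // operations on bits and on the abstract view must agree:
    // setting the writable bit == updating the abstract field
    let bits = encode(e) | 2;
    assert_eq!(decode(bits), AbstractEntry { writable: true, ..e });
}
```

In Coq, this "round-trip" and "operations commute" reasoning is carried out for the actual in-memory byte layout, which is where most of the representation-proof effort went.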
Another interesting part of the verification effort is how we dealt with the code proofs. The HyperEnclave code that deals with page tables is low level, and not well handled by any existing verification tools. Instead, we developed our own framework for verifying Rust code by going via the MIR intermediate language, and used that to verify the key functions. Because MIR is a small language, we were able to write an operational semantics for it, and we verify exactly the MIR code emitted by the compiler, which puts the code proofs on a more solid foundation than using an ad-hoc translator.
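To see why a small intermediate language is attractive to formalize, consider a toy operational semantics for a MIR-like three-address language, written here as a Rust interpreter. This is a didactic sketch, not CertiK's actual Coq semantics or real MIR syntax: each statement is one small-step transition, so the semantics of a whole program is just a fold over the statement list.

```rust
// Toy operational semantics for a MIR-like three-address language.
// Didactic sketch only; real MIR has many more statement kinds.

use std::collections::HashMap;

#[derive(Clone)]
enum Stmt {
    Const(&'static str, i64),                      // x = c
    Add(&'static str, &'static str, &'static str), // x = y + z
}

/// Execute a straight-line program: one transition per statement,
/// threading an environment mapping variables to values.
fn exec(prog: &[Stmt]) -> HashMap<&'static str, i64> {
    let mut env = HashMap::new();
    for s in prog {
        match s {
            Stmt::Const(x, c) => {
                env.insert(*x, *c);
            }
            Stmt::Add(x, y, z) => {
                let v = env[y] + env[z];
                env.insert(*x, v);
            }
        }
    }
    env
}

fn main() {
    // roughly what `let a = 2; let b = 3; let c = a + b;` lowers to
    let prog = [
        Stmt::Const("a", 2),
        Stmt::Const("b", 3),
        Stmt::Add("c", "a", "b"),
    ];
    assert_eq!(exec(&prog)["c"], 5);
}
```

A semantics of this shape, mechanized in Coq, lets one reason directly about the compiler-emitted code rather than a hand-translated approximation of the Rust source.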
Overall, this was a substantial effort, which involved around 17,300 lines of hand-written proof scripts checked by the Coq proof assistant. There are also several thousand lines of specification, and a large amount of automatically generated Coq files for the imported MIR code and layer scaffolding. The main components of the proofs are noninterference (6600 lines), code proofs (4200 lines) and page table data representation proofs (4400 lines), in addition to theorems related to the framework for code proofs. They culminate in a set of theorems stating data-flow noninterference for the main HyperEnclave system calls.
Trusted Execution Environments are a foundational technology, both for cloud computing in general and also for Web3 applications. In this project, CertiK applied state-of-the-art formal verification technology to HyperEnclave's most important component, which provides strong assurances that the code behaves as intended and does indeed enforce the desired security properties.
This formal verification effort is only one part of security assurance for confidential computing. It covers only the page table management part of RustMonitor, and RustMonitor is only one component of the full system. Going forward, we will apply code auditing, testing, and formal verification to other confidential computing components. But by verifying the most central part, we have laid a strong foundation for confidential cloud computing.