Project Everest is a multiyear collaborative effort focused on building a verified, secure communications stack designed to improve the security of HTTPS, a key internet safeguard. This post, about the verification tools and techniques the Everest team is using and developing, is the first in a series exploring the groundbreaking work, which is available on GitHub now.
Wouldn’t it be great if a message you sent to your bank over the internet was guaranteed to be safe from tampering and readable only by your financial institution? Project Everest is building software that provides such a guarantee as a theorem about the code that implements a secure communication protocol deployed in web browsers and servers everywhere.
“Proving theorems about programs has been a dream of computer science for the last 60 years or more, and we’re finally able to do this at the scale required for an important, widely deployed security-critical piece of software,” says Microsoft Senior Researcher Nik Swamy, a member of the Project Everest team.
The security of internet communications crucially depends on a variety of cryptographic algorithms and protocols. The most widely used among these fall under the umbrella of the Transport Layer Security (TLS) protocol. TLS is used for secure web browsing via HTTPS, email, Voice over IP, instant messaging, and many other kinds of communication. Unfortunately, TLS and its many implementations have been attacked repeatedly over the protocol’s 25-year history.
Project Everest is an ongoing collaboration started in 2016 with researchers from Microsoft Research Redmond, Microsoft Research Cambridge, Microsoft Research India, the Microsoft Research-Inria Joint Centre in Paris, Inria, Carnegie Mellon University, and The University of Edinburgh. Growing out of several prior Microsoft Research projects, including Ironclad, miTLS, and F*, Everest aims to develop and deploy efficient, verified, open-source implementations of the entire TLS stack and related protocols, formally reducing the security of the code to assumptions about the hardness of certain cryptographic problems.
For Jonathan Protzenko, a Microsoft researcher on the Everest team, the project’s open collaboration is special.
“Everest makes for a tight interaction between industrial and academic research,” he says. “Our members frequently visit each other, co-author papers together, and send students from one institution to the other over the summer. Several of our members studied at or later moved to these institutions. In that sense, for me, Project Everest truly represents the ideal of open, collaborative research.”
Project Everest is halfway through its projected five-year arc, and its verified components are beginning to replace the current infrastructure with proven, secure software. For instance, Everest’s HACL* library provides verified cryptographic primitives for Mozilla Firefox, for the WireGuard VPN, and for the Tezos blockchain. And within Microsoft, Everest’s miTLS protocol stack powers the primary implementation of the QUIC transport protocol. The Everest team expects to announce further deployments in the coming weeks. Meanwhile, Everest code is already open source and is developed publicly on GitHub.
Formal Verification of Software
Formal verification involves using software tools, including various kinds of theorem provers and proof assistants, to analyze all possible behaviors of a program and prove mathematically that they comply with the code’s specification, a machine-readable description of the developer’s intentions. Once the code has been mechanically verified against its specification, a skeptical auditor who trusts the software used to check the proofs need only study the specifications and the statements of the proven theorems, not the much larger programs and proofs.
“Most software built today gets tested before it is released—at least one hopes it does!” says Swamy. “But even the most rigorous testing can only find bugs; it cannot rule out the existence of errors. For certain kinds of software, say security-critical code like TLS, one may actually want to prove that no vulnerabilities exist. Software verification is time-consuming and requires expertise, but, unlike testing, it can actually guarantee mathematically the absence of entire classes of errors.”
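The gap between testing and exhaustive reasoning that Swamy describes can be made concrete with a small sketch. The Python below is an illustration only, not Everest code: the function names, the 8-bit word size, and the hand-picked test inputs are all assumptions chosen for the example. A midpoint routine that wraps around at 8 bits passes a handful of spot tests, yet comparing it with a simple mathematical specification on every input exposes the bug.

```python
def spec_midpoint(lo, hi):
    """Specification: the midpoint over unbounded mathematical integers."""
    return (lo + hi) // 2


def buggy_midpoint(lo, hi):
    """Models 8-bit arithmetic: the sum wraps around at 256 before halving."""
    return ((lo + hi) & 0xFF) // 2


def safe_midpoint(lo, hi):
    """Avoids the overflow: every intermediate value stays within 0..255."""
    return lo + (hi - lo) // 2


def first_counterexample(impl):
    """Compare impl with the spec on every pair 0 <= lo <= hi <= 255."""
    for lo in range(256):
        for hi in range(lo, 256):
            if impl(lo, hi) != spec_midpoint(lo, hi):
                return (lo, hi)
    return None  # no disagreement anywhere in the domain


# Hand-picked tests (an assumption of this sketch) that happen to miss the bug.
spot_tests = [(0, 0), (3, 9), (10, 20), (100, 120)]
spot_tests_pass = all(
    buggy_midpoint(lo, hi) == spec_midpoint(lo, hi) for lo, hi in spot_tests
)

print(spot_tests_pass)                       # True: the tests look fine
print(first_counterexample(buggy_midpoint))  # (1, 255): the sum 256 wraps to 0
print(first_counterexample(safe_midpoint))   # None
```

Enumeration works here only because the domain is tiny. A verifier such as F* establishes this kind of statement by proof, for all inputs, without running the program.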
For Everest programs, the team’s specifications cover a range of properties, including:
- Memory safety: A program never violates the memory abstractions, and, as a consequence, is free from common bugs and vulnerabilities like buffer overflows, null-pointer dereferences, use-after-frees, and double-frees.
- Type safety: A program respects the interfaces among its components, including any abstraction boundaries. For example, one component never passes the wrong kind of parameters to another or accesses its private state.
- Functional correctness: A program’s input/output behavior is fully characterized by a simpler mathematical function, which acts as its functional specification.
- Side-channel resistance: Observations about the implementation’s low-level behavior, such as the time it takes to execute or the memory addresses it accesses, are independent of the secrets manipulated by the program. Hence, an adversary monitoring these “side-channels” learns nothing about the secrets.
- Cryptographic security: Under cryptographic assumptions, and except with negligible probability, Everest programs are indistinguishable from ideal cryptographic functionalities, the mathematical definitions that cryptographers use to capture the notion of secrecy, integrity, and secure communication.
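Of the properties listed above, side-channel resistance is perhaps the easiest to illustrate in a few lines. The following Python sketch is not Everest code, and its function names and sample secret are assumptions made for the example. It models the attacker's observation as a count of loop iterations: a comparison that exits at the first mismatch reveals how long a matching prefix the guess has, while one that always scans the whole input reveals nothing beyond the (public) length.

```python
import hmac


def early_exit_equal(secret, guess):
    """Returns (result, steps); steps depends on where the first mismatch is."""
    steps = 0
    if len(secret) != len(guess):  # lengths are treated as public here
        return False, steps
    for s, g in zip(secret, guess):
        steps += 1
        if s != g:
            return False, steps    # leaks the length of the matching prefix
    return True, steps


def fixed_steps_equal(secret, guess):
    """Returns (result, steps); steps depends only on the public length."""
    steps = 0
    if len(secret) != len(guess):
        return False, steps
    diff = 0
    for s, g in zip(secret, guess):
        steps += 1
        diff |= s ^ g              # accumulate differences, never branch on them
    return diff == 0, steps


secret = b"hunter2!"
print(early_exit_equal(secret, b"Xunter2!"))   # (False, 1)
print(early_exit_equal(secret, b"huntXr2!"))   # (False, 5)
print(fixed_steps_equal(secret, b"Xunter2!"))  # (False, 8)
print(fixed_steps_equal(secret, b"huntXr2!"))  # (False, 8)

# The standard library offers a comparison designed to reduce timing leaks.
print(hmac.compare_digest(secret, b"hunter2!"))  # True
```

The step counter is only an abstract stand-in for time, and Python itself gives no timing guarantees. The point of verification is that properties like this are proven about the compiled low-level code rather than observed on a few runs.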
Formal verification can play a role throughout the software development process, from design to implementation and deployment. Its value is increasingly recognized, especially for security-critical code.
“Cryptographic protocols are notoriously hard to implement correctly, with errors in both the algorithms and the protocol implementation itself being common,” says Eric Rescorla, chief technology officer of Mozilla Firefox, security area director at the Internet Engineering Task Force, and the editor of the TLS standard. “Formal verification tools like those developed by the Project Everest team have transformed the way we design these protocols, allowing us to move faster while having much higher confidence in protocol correctness.”
Indeed, assisted in part by their verification efforts, Benjamin Beurdouche, Karthik Bhargavan, Antoine Delignat-Lavaud, and Cédric Fournet, all members of Project Everest, have contributed features and fixes to the TLS standard. Beyond assisting with designs and specifications, verified implementations are also increasingly attractive for mainstream deployment.
“Verified implementations of cryptographic primitives are gradually making their way into major implementations,” Rescorla adds. “The Curve25519 and ChaCha/Poly1305 algorithms from HACL* are already running in Firefox, and I look forward to the day when we can adopt completely verified implementations.”
F*, Low*, and Vale
All Everest code is programmed and verified using F*, a framework that brings together three strands of research in programming languages.
- F* is an effectful, general-purpose higher-order programming language in the tradition of languages like F#, OCaml, and Haskell, among others.
- F* includes a full-fledged dependent type theory and tactic framework in the tradition of proof assistants like Coq and Nuprl, allowing nearly arbitrary expressive power for conducting formal mathematical proofs.
- Like other program verifiers, including Dafny and Why3, F* is integrated with an automated theorem prover, Z3, which can automate many of the tedious, low-level proof steps necessary to prove programs correct.
“Starting from a language in my dissertation called Fable and a language developed at Microsoft Research Cambridge called F7, F* has evolved repeatedly over the course of almost a decade and is now developed by a large but closely knit team of people and has become a full-fledged proof assistant,” Swamy says.
“F* is unique in its use of both automated and interactive proofs, and its primitive support for effects makes it well-suited for verifying real-world software that is inherently effectful,” adds Aseem Rastogi, a researcher at Microsoft Research India and member of the Project Everest team.
The following simple F* program, a functionally correct implementation of Quicksort on lists, shows how F* operates. Given a list and a total order on its elements, the type of quicksort on the first line asserts that the function always returns a sorted permutation of the original list.
[Figure: F* source listing of quicksort. Only the end of the listing survives in this copy:]
    ... []
      | pivot::tl ->
          let hi, lo = partition (f pivot) tl in
          quicksort lo f @ pivot :: quicksort hi f
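For readers unfamiliar with F*, the specification that quicksort's type expresses, namely that the result is a sorted permutation of the input, can be approximated by executable checks. The Python sketch below mirrors the structure of the surviving fragment, but it is an assumption-laden illustration rather than a translation: the helper names `is_sorted`, `is_permutation`, and `meets_spec` are invented here, and the property is only checked on a small set of inputs, whereas the F* type is established statically for every list and every total order.

```python
from collections import Counter
from itertools import product


def is_sorted(xs, leq):
    """Every adjacent pair is in order according to leq."""
    return all(leq(a, b) for a, b in zip(xs, xs[1:]))


def is_permutation(xs, ys):
    """Both lists hold the same elements with the same multiplicities."""
    return Counter(xs) == Counter(ys)


def quicksort(xs, leq):
    """Same shape as the fragment above: partition the tail around the pivot."""
    if not xs:
        return []
    pivot, rest = xs[0], xs[1:]
    hi = [x for x in rest if leq(pivot, x)]      # elements with pivot <= x
    lo = [x for x in rest if not leq(pivot, x)]  # everything else
    return quicksort(lo, leq) + [pivot] + quicksort(hi, leq)


def meets_spec(xs, leq):
    """Runtime stand-in for the type: output is a sorted permutation of xs."""
    out = quicksort(xs, leq)
    return is_sorted(out, leq) and is_permutation(xs, out)


def leq(a, b):
    return a <= b


# All lists of length 0..4 over the elements {0, 1, 2}.
small_lists = [list(t) for n in range(5) for t in product(range(3), repeat=n)]

print(len(small_lists))                                # 121
print(all(meets_spec(xs, leq) for xs in small_lists))  # True
print(quicksort([3, 1, 2, 3, 0], leq))                 # [0, 1, 2, 3, 3]
```

Checking 121 lists raises confidence but proves nothing about the 122nd. The dependent type in the F* version closes that gap by making the property part of what the type checker must prove.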
Proving properties of purely functional code, such as quicksort above, is relatively straightforward. To achieve high performance, however, Everest code is also programmed in two domain-specific languages embedded in F*: Low* and Vale.
Verifying efficient low-level code in F*
Low* is a subset of F* geared toward low-level programming with explicit memory management. Low* programs are extracted to idiomatic C code by a tool called KReMLin and run without garbage collection, reference counting, or any other automated memory management strategy. The HACL* library and Everest’s verified implementation of the TLS-1.3 record layer are programmed and verified in Low*.
For various cryptographic primitives, peak performance can only be obtained by going still lower level and programming in assembly language, often taking advantage of specialized hardware instructions, such as Intel AES-NI. Vale stands for “Verified Assembly Language for Everest” and provides a domain-specific language for writing and verifying assembly programs targeting different platforms, like Windows, macOS, and Linux, and architectures, like x86, x64, and ARM. It also supports multiple verification systems for carrying out proofs, including F* and Dafny.
The 2019 Symposium on Principles of Programming Languages (POPL) features a paper on Vale. “A Verified, Efficient Embedding of a Verifiable Assembly Language” describes how the Everest team embedded Vale in F*, making use of F*’s dependent type system and its computational capabilities to implement an efficient verification condition generator for embedded assembly programs. Using Vale in F*, the Everest team reports on the first provably correct implementation of AES-GCM, a cryptographic routine used by 90 percent of secure internet traffic.
What’s next?
In addition to the availability of Everest components on GitHub, the team is planning the first integrated release of Everest’s verified stack, its libraries, and its verification tools for early summer 2019.
“Working at the intersection of systems, networking, cryptography, programming languages, and program verification, Everest is unique in that it co-develops verified software with the tools and techniques needed to build it,” says Swamy. “But we’re also looking beyond our specific goals for TLS to apply our verification technology to other areas.”
For instance, the Everest team is also working on verifying implementations of post-quantum cryptography and proving the security of key components used in high-assurance enterprise blockchains.
Spanning four continents and 12 time zones, the Everest team works around the clock trying to build a more secure internet. There’s still a long way to go to the summit, but halfway through the project, the team has learned a lot and has put down footholds to make it easier to build provably secure software at scale.