Hardening Attack Surfaces with Formally Proven Binary Format Parsers
- Nikhil Swamy ,
- Tahina Ramananandro ,
- Aseem Rastogi ,
- Irina Spiridonova ,
- Haobin Ni ,
- Dmitry Malloy ,
- Juan Vazquez ,
- Michael Tang ,
- Omar Cardona ,
- Arti Gupta
2022 Programming Language Design and Implementation |
With an eye toward performance, interoperability, or legacy concerns, low-level system software often must parse binary encoded data formats. Few tools are available for this task, especially since the formats involve a mixture of arithmetic and data dependence, beyond what can be handled by typical parser generators. As such, parsers are written by hand in languages like C, with inevitable errors leading to security vulnerabilities.
Addressing this need, we present EverParse3D, a parser generator for binary message formats that yields performant C code backed by fully automated formal proofs of memory safety, arithmetic safety, functional correctness, and even double-fetch freedom to prevent certain kinds of time-of check/time-of-use errors. This allows systems developers to specify their message formats declaratively and to integrate correct-by-construction C code into their applications, eliminating several classes of bugs.
EverParse3D has been in use in the Windows kernel for the past year. Applied primarily to the Hyper-V network virtualization stack, the formats of nearly 100 different messages spanning four protocols have been specified in EverParse3D and the resulting formally proven parsers have replaced prior handwritten code. We report on our experience in detail