Structure Preserving Anonymization of Router Configuration Data
- Dave Maltz,
- Jibin Zhan,
- Gísli Hjálmtýsson,
- Albert Greenberg,
- Jennifer Rexford,
- Geoffrey G. Xie,
- Hui Zhang
IMC '04: Proceedings of the 4th ACM SIGCOMM Conference on Internet Measurement
Published by ACM SIGCOMM
A repository of router configuration files from production networks would provide the research community with a treasure trove of data about network topologies, routing designs, and security policies. However, configuration files have been largely unobtainable precisely because they provide detailed information that could be exploited by competitors and attackers. This paper describes a method for anonymizing router configuration files by removing all information that connects the data to the identity of the originating network, while still preserving the structure of information that makes the data valuable to networking researchers. Anonymizing configuration files has unusual requirements, including preserving relationships between elements of data, anonymizing regular expressions, and robustly coping with more than 200 versions of the configuration language. Conventional tools and techniques are poorly suited to the problem. Our anonymization method has been validated with a major carrier, earning unprivileged researchers access to the configuration files of thousands of routers in hundreds of networks. Through example analysis, we demonstrate that the anonymized data retains the key properties of the network design. The paper sets out techniques that could be used in an attempt to break the anonymization and concludes that the method is most applicable to enterprise networks. When applied to backbone networks, which are few in number and many of whose properties can be publicly measured, the anonymization might be broken by the fingerprinting techniques described in this paper.
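To make the requirement of "preserving relationships between elements of data" concrete, here is a minimal sketch and not the paper's actual tool. It assumes a salted hash applied to every token outside a small pass-through keyword list, so that a name used in two places maps to the same opaque string in both. The salt, the keyword list, the sample configuration lines, and the function names are all hypothetical choices made for illustration.

```python
import hashlib

# Hypothetical secret salt; in practice it would be kept private by the data owner.
SALT = b"example-secret-salt"

# Hypothetical, deliberately tiny pass-through list of configuration keywords.
KEYWORDS = {"router", "bgp", "neighbor", "route-map", "in", "out",
            "permit", "deny", "interface", "description"}


def anonymize_token(token: str) -> str:
    """Map a token to an opaque string; equal inputs give equal outputs."""
    digest = hashlib.sha1(SALT + token.encode("utf-8")).hexdigest()
    return "x" + digest[:10]


def anonymize_line(line: str) -> str:
    """Hash every word that is not a known keyword, keeping spacing intact."""
    words = line.split(" ")  # empty strings preserve runs of spaces
    return " ".join(
        w if (w == "" or w in KEYWORDS) else anonymize_token(w)
        for w in words
    )


if __name__ == "__main__":
    # Made-up configuration lines: a route-map definition and a reference to it.
    print(anonymize_line("route-map ACME-IN permit"))
    print(anonymize_line(" neighbor route-map ACME-IN in"))
```

Because the mapping is deterministic, the definition of the route-map and the line that references it still point at each other after anonymization, while the identifying name itself no longer appears. A sketch this simple does not address the harder cases the abstract names, such as regular expressions or the many versions of the configuration language.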