XRay: Increasing the Web’s Transparency with Differential Correlation
- Mathias Lecuyer
- Guillaume Ducoffe
- Francis Lan
- Andrei Papancea
- Theofilos Petsios
- Riley Spahn
- Augustin Chaintreau
- Roxana Geambasu
USENIX Security'14
Organized by ACSA
Today’s Web services, such as Google, Amazon, and Facebook, leverage user data for varied purposes, including personalizing recommendations, targeting advertisements, and adjusting prices. At present, users have little insight into how their data is being used, so they cannot make informed choices about which services to use. To increase transparency, we developed XRay, the first fine-grained, robust, and scalable personal data tracking system for the Web. XRay predicts which data in an arbitrary Web account (such as emails, searches, or viewed products) is being used to target which outputs (such as ads, recommended products, or prices). XRay’s core functions are service agnostic and easy to instantiate for new services, and they can track data within and across services. To make predictions independent of the audited service, XRay relies on the following insight: by comparing outputs from different accounts with similar, but not identical, subsets of data, one can pinpoint targeting through correlation. We show, both theoretically and through experiments on Gmail, Amazon, and YouTube, that XRay achieves high precision and recall by correlating data from a surprisingly small number of extra accounts.
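The core insight, comparing outputs across accounts that hold overlapping but different subsets of data, can be illustrated with a toy sketch. This is not XRay's actual algorithm: the function name `guess_targeted_input`, the `min_gap` threshold, the simple rate-difference score, and the account and ad names are all assumptions made for illustration only.

```python
from typing import Dict, Optional, Set


def guess_targeted_input(
    accounts: Dict[str, Set[str]],
    observed: Dict[str, Set[str]],
    output: str,
    min_gap: float = 0.5,
) -> Optional[str]:
    """Return the input whose presence best separates accounts that saw `output`.

    Hypothetical scoring: (share of accounts WITH the input that saw the output)
    minus (share of accounts WITHOUT the input that saw the output).
    """
    all_inputs = set().union(*accounts.values())
    best_input, best_gap = None, min_gap
    for item in sorted(all_inputs):
        with_item = [a for a, data in accounts.items() if item in data]
        without_item = [a for a, data in accounts.items() if item not in data]
        if not with_item or not without_item:
            continue  # no variation across accounts, so nothing to compare
        rate_with = sum(output in observed[a] for a in with_item) / len(with_item)
        rate_without = sum(output in observed[a] for a in without_item) / len(without_item)
        gap = rate_with - rate_without
        if gap > best_gap:
            best_input, best_gap = item, gap
    return best_input


# Hypothetical accounts, each populated with a different subset of emails.
accounts = {
    "acct1": {"email_flight", "email_loan"},
    "acct2": {"email_flight", "email_diet"},
    "acct3": {"email_loan", "email_diet"},
    "acct4": {"email_diet"},
}
# Hypothetical ads observed in each account.
observed = {
    "acct1": {"ad_hotel", "ad_generic"},
    "acct2": {"ad_hotel", "ad_generic"},
    "acct3": {"ad_generic"},
    "acct4": {"ad_generic"},
}

print(guess_targeted_input(accounts, observed, "ad_hotel"))    # email_flight
print(guess_targeted_input(accounts, observed, "ad_generic"))  # None
```

In this toy data, `ad_hotel` appears only in accounts that contain `email_flight`, so that email is flagged, while `ad_generic` appears everywhere and is attributed to no input. A real system would also have to handle noise, inputs that are targeted in combination, and the question of how many extra accounts are needed, none of which this sketch addresses.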