Differentially Private Aggregation of Distributed Time-Series with Transformation and Encryption
- Vibhor Rastogi ,
- Suman Nath
SIGMOD '10: Proceedings of the 2010 ACM SIGMOD international conference on Management of data |
Published by Association for Computing Machinery, Inc.
We propose PASTE, the first differentially private aggregation algorithms for distributed time-series data that offer good practical utility without any trusted server. PASTE addresses two important challenges in participatory data-mining applications where (i) individual users collect temporally correlated time-series data (such as location traces, web history, personal health data), and (ii) an untrusted third-party aggregator wishes to run aggregate queries on the data. To address this, PASTE incorporates two new algorithms. To ensure differential privacy for time-series data despite the presence of temporal correlation, PASTE uses the Fourier Perturbation Algorithm (FPAk). Standard differential privacy techniques perform poorly for time-series data. To answer n queries, such techniques can result in a noise of Θ(n) to each query answer, making the answers practically useless if n is large. Our FPAk algorithm perturbs the Discrete Fourier Transform of the query answers. For answering n queries, FPAk improves the expected error from Θ(n) to roughly Θ(k) where k is the number of Fourier coefficients that can (approximately) reconstruct all the n query answers. Our experiments show that k ≪ n for many real-life data-sets resulting in a huge error-improvement for FPAk. To deal with the absence of a trusted central server, PASTE uses the Distributed Laplace Perturbation Algorithm (DLPA) that adds noise in a distributed way in order to guarantee differential privacy. To the best of our knowledge, DLPA is the first distributed differentially private algorithm that can scale with a large number of users: DLPA outperforms the only other distributed solution for differential privacy proposed so far, by reducing the computational load per user from O(U) to O(1) where U is the number of users.
Copyright © 2007 by the Association for Computing Machinery, Inc. Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, to republish, to post on servers, or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from Publications Dept, ACM Inc., fax +1 (212) 869-0481, or [email protected]. The definitive version of this paper can be found at ACM's Digital Library --http://www.acm.org/dl/.