Boosting is a general method for improving the accuracy of learning algorithms. We use boosting to construct improved {\em privacy-preserving synopses} of an input database. These are data structures that yield, for a given set $Q$ of queries over an input database, reasonably accurate estimates of the responses to every query in~$Q$, even when the number of queries is much larger than the number of rows in the database. Given a {\em base synopsis generator} that takes a distribution on $Q$ and produces a ``weak'' synopsis that yields ``good'' answers for a majority of the weight in $Q$, our {\em Boosting for Queries} algorithm obtains a synopsis that is good for all of~$Q$. We ensure privacy for the rows of the database, but the boosting is performed on the {\em queries}. We also provide the first synopsis generators for arbitrary sets of arbitrary low-sensitivity queries, {\it i.e.}, queries whose answers do not vary much under the addition or deletion of a single row. In the execution of our algorithm, certain tasks, each incurring some privacy loss, are performed many times. To analyze the cumulative privacy loss, we obtain an $O(\varepsilon^2)$ bound on the {\em expected} privacy loss from a single $\varepsilon$-differentially private mechanism. Combining this with evolution of confidence arguments from the literature, we get stronger bounds on the expected cumulative privacy loss due to multiple mechanisms, each of which provides $\varepsilon$-differential privacy or one of its relaxations, and each of which operates on (potentially) different, adaptively chosen databases.
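
As a hedged illustration of why the expected privacy loss can be quadratic in $\varepsilon$ while the worst-case loss is linear, consider randomized response on a single bit. This example and its notation ($M$, $b$, $y$, $p$) are assumptions introduced here for illustration and are not taken from the abstract. Let $M(b)$ output $b$ with probability $p = e^{\varepsilon}/(1+e^{\varepsilon})$ and $1-b$ otherwise, so that $p/(1-p) = e^{\varepsilon}$ and $M$ is $\varepsilon$-differentially private. The privacy loss on outcome $y$ is $+\varepsilon$ when $y = b$ and $-\varepsilon$ when $y = 1-b$, hence
\[
\mathop{\mathbf{E}}_{y \sim M(b)}\left[\ln\frac{\Pr[M(b)=y]}{\Pr[M(1-b)=y]}\right]
= p\,\varepsilon - (1-p)\,\varepsilon
= \varepsilon\cdot\frac{e^{\varepsilon}-1}{e^{\varepsilon}+1}
= \varepsilon\tanh(\varepsilon/2)
\le \frac{\varepsilon^{2}}{2},
\]
where the last step uses $\tanh(x) \le x$ for $x \ge 0$. For instance, at $\varepsilon = 0.1$ the worst-case loss is $0.1$ while the expected loss is roughly $0.005$. This is only a single-mechanism sanity check under the stated assumptions; it does not reproduce the general bound or the composition argument described above.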