JMLR: Workshop and Conference Proceedings

The Price of Selection in Differential Privacy

Abstract

In the differentially private top-$k$ selection problem, we are given a dataset $X \in \{\pm 1\}^{n \times d}$, in which each row belongs to an individual and each column corresponds to some binary attribute, and our goal is to find a set of $k \ll d$ columns whose means are approximately as large as possible. Differential privacy requires that our choice of these $k$ columns does not depend too much on any one individual's data. This problem can be solved using the well-known exponential mechanism and the composition properties of differential privacy. In the high-accuracy regime, where we require the error of the selection procedure to be smaller than the so-called sampling error $\alpha \approx \sqrt{\ln(d)/n}$, this procedure succeeds given a dataset of size $n \gtrsim k \ln(d)$. We prove a matching lower bound, showing that a dataset of size $n \gtrsim k \ln(d)$ is necessary for private top-$k$ selection in this high-accuracy regime. Our lower bound shows that selecting the $k$ largest columns requires more data than simply estimating the values of those $k$ columns, which can be done using a dataset of size just $n \gtrsim k$.
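The upper bound mentioned in the abstract combines the exponential mechanism with composition. The following is a minimal sketch of one common way to do that, not code from the paper: it runs $k$ rounds of the exponential mechanism, removing each selected column before the next round, and splits the privacy budget evenly across rounds using basic composition. The function name `exp_mech_top_k`, the even budget split, and the choice of column sums as the score are assumptions made for illustration.

```python
import math
import random


def exp_mech_top_k(X, k, epsilon, rng=None):
    """Select k distinct column indices of a +/-1 matrix X (list of rows).

    Illustrative assumptions: score = column sum, budget epsilon / k per
    round (basic composition), neighbouring datasets differ in one row.
    """
    rng = rng or random.Random()
    d = len(X[0])
    # Score of a column: sum of its +/-1 entries (n times the column mean).
    scores = [sum(row[j] for row in X) for j in range(d)]
    sensitivity = 2.0        # replacing one row moves each column sum by at most 2
    eps_round = epsilon / k  # basic composition over the k rounds
    remaining = list(range(d))
    chosen = []
    for _ in range(k):
        # Subtract the current maximum so the exponentials cannot overflow.
        top = max(scores[j] for j in remaining)
        weights = [
            math.exp(eps_round * (scores[j] - top) / (2.0 * sensitivity))
            for j in remaining
        ]
        # Sample one remaining column with probability proportional to its weight.
        pick = rng.choices(remaining, weights=weights, k=1)[0]
        chosen.append(pick)
        remaining.remove(pick)
    return chosen


# Usage: 200 individuals, 5 binary attributes, select 2 columns.
X = [[1, 1, -1, -1, -1] for _ in range(200)]
selected = exp_mech_top_k(X, k=2, epsilon=1.0)
```

Each round samples column $j$ with probability proportional to $\exp(\varepsilon_0 s_j / (2\Delta))$, where $s_j$ is the column sum, $\Delta = 2$ is its sensitivity, and $\varepsilon_0 = \varepsilon / k$. Smaller values of `epsilon` make the selection noisier, which is the accuracy cost the abstract's sample-size bounds quantify.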