首页> 外文会议>2017 IEEE Symposium on Privacy-Aware Computing >Distributed Cardinality Estimation of Set Operations with Differential Privacy
【24h】

Distributed Cardinality Estimation of Set Operations with Differential Privacy

机译:具有差分隐私的集合操作的分布式基数估计

获取原文
获取原文并翻译 | 示例

摘要

In this paper we study the problem of estimating the cardinality of pairwise set operations (union and intersection) over sets possessed by different data owners, while preserving differential privacy. In our problem setting, a data owner could only communicate with an untrusted server, and thus have to perturb its set data for privacy protection before sharing them with the server. This problem setting is relevant to diverse applications in practice, including sensor-based traffic monitoring, cross-domain data integration, and combining findings from multiple surveys. To tackle this problem, we first adopt existing randomized response technique to perturb the bit vector (to achieve differential privacy) and develop tools which the server can use to derive the cardinality of set operations from the randomized bit vectors. However, the variance of the union/intersection estimator grows linearly with the universe (bit-vector) size which is impractical for large universes. To keep the variance low we in addition propose to resort to Bloom filters instead of high-dimensional bit vectors to share set data with the server. The key insight is that in spite of inevitable collisions in BF by keeping its size small we can bound the variance of the union/intersection cardinality estimators. Finally, we show that investing a small part of the privacy budget into reporting (obfuscated) set cardinality can further reduce the estimator errors for up to 20%. Our empirical analysis reveals the impact of various parameters including privacy budget and Bloom filter size on the overall accuracy of the approach and demonstrates the utility of the proposed solution.
机译:在本文中,我们研究了在保留差异隐私的同时,估计不同数据所有者拥有的集合上的成对集合操作(​​联合和交集)的基数的问题。在我们的问题设置中,数据所有者只能与不受信任的服务器通信,因此在与服务器共享数据之前,必须先扰动其设置的数据以保护隐私。此问题的设置与实践中的各种应用程序相关,包括基于传感器的流量监控,跨域数据集成以及合并来自多个调查的结果。为了解决这个问题,我们首先采用现有的随机响应技术来扰动位向量(以实现差分隐私),并开发工具,服务器可以使用该工具从随机位向量中导出集合操作的基数。但是,并集/相交估计量的方差随宇宙(位向量)的大小线性增长,这对于大宇宙是不切实际的。为了使方差保持较低,我们还建议使用Bloom过滤器代替高维位向量来与服务器共享设置数据。关键的见解是,尽管在BF中不可避免地发生碰撞,但通过保持其大小较小,我们可以限制联合/交叉点基数估计量的方差。最终,我们表明,将隐私预算的一小部分投入到报告(模糊的)集合基数上,可以进一步将估计器错误减少多达20%。我们的经验分析揭示了各种参数(包括隐私预算和布隆过滤器大小)对方法总体精度的影响,并证明了所提出解决方案的实用性。

著录项

相似文献

  • 外文文献
  • 中文文献
  • 专利
获取原文

客服邮箱:kefu@zhangqiaokeyan.com

京公网安备:11010802029741号 ICP备案号:京ICP备15016152号-6 六维联合信息科技 (北京) 有限公司©版权所有
  • 客服微信

  • 服务号