ACM Transactions on Knowledge Discovery from Data

Data Sharing via Differentially Private Coupled Matrix Factorization

Abstract

We address the privacy-preserving data-sharing problem in a distributed multiparty setting. In this setting, each data site owns a distinct part of a dataset, and the aim is to estimate the parameters of a statistical model conditioned on the complete data without any site revealing any information about the individuals in its own part. The sites want to maximize the utility of the collective data analysis while providing privacy guarantees both for their own portion of the data and for each participating individual. Our first contribution is to classify these different privacy requirements as (i) site-level and (ii) user-level differential privacy and to present formal privacy guarantees for these two cases under the model of differential privacy. To satisfy a stronger form of differential privacy, we use local differential privacy, a variant in which the sensitive data are perturbed with a randomized response mechanism prior to estimation. In this study, we assume that the data instances partitioned among the parties are arranged as matrices. A natural statistical model for this distributed scenario is coupled matrix factorization. We present two generic frameworks for privatizing Bayesian inference in coupled matrix factorization models, which guarantee the proposed differential privacy notions according to the privacy requirements of the model. To privatize Bayesian inference, we first exploit the connection between differential privacy and sampling from a Bayesian posterior via stochastic gradient Langevin dynamics, and then derive an efficient coupled matrix factorization method. In the local privacy context, we propose two models with an additional privatization mechanism that achieves a stronger measure of privacy, and we introduce a Gibbs-sampling-based algorithm. We demonstrate that the proposed methods provide good prediction accuracy on synthetic and real datasets while adhering to the introduced privacy constraints.
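The abstract describes privatizing posterior sampling for coupled matrix factorization through stochastic gradient Langevin dynamics. As a rough illustration of that family of techniques, and not the paper's algorithm, the sketch below performs one noisy SGLD-style update for two data matrices X ≈ U V1ᵀ and Y ≈ U V2ᵀ that share a factor U, with per-user clipping and Gaussian noise added to the data-dependent gradients. All names and parameter choices (clip norm C, noise scale sigma_dp, step size eta, prior precision lam) are illustrative assumptions.

import numpy as np

rng = np.random.default_rng(0)

def clip_rows(G, C):
    # Clip each row of G to L2 norm at most C, limiting any single user's influence.
    norms = np.linalg.norm(G, axis=1, keepdims=True)
    return G * np.minimum(1.0, C / np.maximum(norms, 1e-12))

def dp_sgld_step(U, V1, V2, X, Y, eta=1e-3, lam=1.0, C=1.0, sigma_dp=1.0):
    # One noisy SGLD-style update on the shared factor U and the view-specific
    # factors V1, V2.  Per-user (per-row) gradient contributions are clipped
    # before accumulation and Gaussian noise scaled by C * sigma_dp is added to
    # the data-dependent gradient -- the usual recipe for privatizing
    # gradient-based posterior sampling; the noise calibration here is illustrative.
    R1 = X - U @ V1.T          # residuals for the first view
    R2 = Y - U @ V2.T          # residuals for the second view

    # Each row of (R1 @ V1 + R2 @ V2) is one user's gradient for their row of U.
    G_U = clip_rows(R1 @ V1 + R2 @ V2, C) - lam * U

    # For V1, V2: clip each user's residual row before summing the contributions.
    G_V1 = clip_rows(R1, C).T @ U - lam * V1
    G_V2 = clip_rows(R2, C).T @ U - lam * V2

    def step(theta, grad):
        noise_dp = C * sigma_dp * rng.standard_normal(theta.shape)       # privacy noise
        noise_sgld = np.sqrt(2.0 * eta) * rng.standard_normal(theta.shape)  # Langevin noise
        return theta + eta * (grad + noise_dp) + noise_sgld

    return step(U, G_U), step(V1, G_V1), step(V2, G_V2)

# Tiny usage example on synthetic data.
n, d1, d2, k = 50, 8, 6, 3
U = rng.standard_normal((n, k))
V1 = rng.standard_normal((d1, k))
V2 = rng.standard_normal((d2, k))
X = U @ V1.T + 0.1 * rng.standard_normal((n, d1))
Y = U @ V2.T + 0.1 * rng.standard_normal((n, d2))
for _ in range(200):
    U, V1, V2 = dp_sgld_step(U, V1, V2, X, Y)

Calibrating sigma_dp to a concrete (ε, δ) budget, distinguishing the site-level and user-level guarantees, and handling the multiparty exchange of factors are precisely the parts the paper works out and are not attempted in this sketch.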
