
Differentially Private Distributed Online Learning


Abstract

In the big data era, data generation presents new characteristics, including wide distribution, high velocity, high dimensionality, and privacy concerns. To address these challenges in big data analytics, we develop a privacy-preserving distributed online learning framework for data collected from distributed sources. Specifically, each node (i.e., data source) has the capacity to learn a model from its local dataset, and exchanges intermediate parameters with a random subset of its neighboring (logically connected) nodes. Hence, the communication topology of our distributed computing framework is not fixed in practice. As online learning often operates on sensitive data, we introduce the notion of differential privacy (DP) into our distributed online learning algorithm (DOLA) to protect data privacy during learning, preventing an adversary from inferring any significant sensitive information. Our model is of general value for big data analytics in the distributed setting, because it provides a rigorous and scalable privacy proof and has much lower computational complexity than classic schemes, e.g., secure multiparty computation (SMC). To handle high-dimensional incoming data entries, we study a sparse version of the DOLA with novel DP techniques that saves computing resources and improves utility. Furthermore, we present two modified private DOLAs to meet the needs of practical applications. One converts the DOLA to distributed stochastic optimization in an offline setting; the other uses a mini-batch approach to reduce the amount of perturbation noise and improve utility. We conduct experiments on real datasets in a configured distributed platform. Numerical results validate the feasibility of our private DOLAs.
