Improving Parameter Estimation and Defensive Ability of Latent Dirichlet Allocation Model Training Under Rényi Differential Privacy

黄涛; 赵素云; 陈红; 刘艺璇

首页> 中文期刊> 《计算机科学技术学报：英文版》 >Improving Parameter Estimation and Defensive Ability of Latent Dirichlet Allocation Model Training Under Rényi Differential Privacy

Improving Parameter Estimation and Defensive Ability of Latent Dirichlet Allocation Model Training Under Rényi Differential Privacy

开具论文收录证明 >>

文献代查 >>

页面导航

摘要
著录项
相似文献
相关主题

摘要

Latent Dirichlet allocation(LDA)is a topic model widely used for discovering hidden semantics in massive text corpora.Collapsed Gibbs sampling(CGS),as a widely-used algorithm for learning the parameters of LDA,has the risk of privacy leakage.Specifically,word count statistics and updates of latent topics in CGS,which are essential for parameter estimation,could be employed by adversaries to conduct effective membership inference attacks(MIAs).Till now,there are two kinds of methods exploited in CGS to defend against MIAs:adding noise to word count statistics and utilizing inherent privacy.These two kinds of methods have their respective limitations.Noise sampled from the Laplacian distribution sometimes produces negative word count statistics,which render terrible parameter estimation in CGS.Utilizing inherent privacy could only provide weak guaranteed privacy when defending against MIAs.It is promising to propose an effective framework to obtain accurate parameter estimations with guaranteed differential privacy.The key issue of obtaining accurate parameter estimations when introducing differential privacy in CGS is making good use of the privacy budget such that a precise noise scale is derived.It is the first time that R′enyi differential privacy(RDP)has been introduced into CGS and we propose RDP-LDA,an effective framework for analyzing the privacy loss of any differentially private CGS.RDP-LDA could be used to derive a tighter upper bound of privacy loss than the overestimated results of existing differentially private CGS obtained byε-DP.In RDP-LDA,we propose a novel truncated-Gaussian mechanism that keeps word count statistics non-negative.And we propose distribution perturbation which could provide more rigorous guaranteed privacy than utilizing inherent privacy.Experiments validate that our proposed methods produce more accurate parameter estimation under the JS-divergence metric and obtain lower precision and recall when defending against MIAs.

著录项

来源
《计算机科学技术学报：英文版》 |2022年第6期|1382-1397|共16页
作者
黄涛; 赵素云; 陈红; 刘艺璇;
展开▼
作者单位

Key Laboratory of Data Engineering and Knowledge Engineering(Renmin University of China);

Ministry of Education Beijing 100087;

China;

School of Information;

Renmin University of China;

Beijing 100087;

China;

展开▼
原文格式 PDF
正文语种 chi
中图分类计算技术、计算机技术;
关键词
latent Dirichlet allocation; parameter estimation; membership inference attack; Rényi differential privacy;

相似文献

中文文献
外文文献

Improving Parameter Estimation and Defensive Ability of Latent Dirichlet Allocation Model Training Under Rényi Differential Privacy

摘要

著录项

相似文献

相关主题

期刊订阅