Discovering Frequent Patterns in Sensitive Data

Abstract

Discovering frequent patterns from data is a popular exploratory technique in data mining. However, if the data are sensitive (e.g., patient health records, user behavior records), releasing information about significant patterns or trends carries significant risk to privacy. This paper shows how one can accurately discover and release the most significant patterns along with their frequencies in a data set containing sensitive information, while providing rigorous guarantees of privacy for the individuals whose information is stored there.

We present two efficient algorithms for discovering the k most frequent patterns in a data set of sensitive records. Our algorithms satisfy differential privacy, a recently introduced definition that provides meaningful privacy guarantees in the presence of arbitrary external information. Differentially private algorithms require a degree of uncertainty in their output to preserve privacy. Our algorithms handle this by returning 'noisy' lists of patterns that are close to the actual list of k most frequent patterns in the data. We define a new notion of utility that quantifies the output accuracy of private top-k pattern mining algorithms. In typical data sets, our utility criterion implies low false positive and false negative rates in the reported lists. We prove that our methods meet the new utility criterion; we also demonstrate the performance of our algorithms through extensive experiments on the transaction data sets from the FIMI repository. While the paper focuses on frequent pattern mining, the techniques developed here are relevant whenever the data mining output is a list of elements ordered according to an appropriately 'robust' measure of interest.
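
The abstract does not spell out the two algorithms, so the following is only a minimal sketch of the general idea it describes (a 'noisy' top-k list with noisy frequencies), not the authors' method. It assumes single items rather than general patterns, a fixed public item `universe`, a "peeling" use of the exponential mechanism (named in the keywords) to pick k items, and Laplace noise on the released counts. All function and parameter names here (`private_top_k`, `exponential_mechanism`, `laplace_noise`, `universe`, the even budget split) are hypothetical choices made for illustration.

```python
import math
import random
from collections import Counter


def laplace_noise(scale, rng):
    # The difference of two i.i.d. exponentials with mean `scale`
    # is Laplace-distributed with location 0 and that scale.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def exponential_mechanism(scores, epsilon, rng, sensitivity=1.0):
    # Sample one key with probability proportional to
    # exp(epsilon * score / (2 * sensitivity)).
    keys = list(scores)
    top = max(scores.values())  # shift by the max score for numerical stability
    weights = [
        math.exp(epsilon * (scores[key] - top) / (2.0 * sensitivity))
        for key in keys
    ]
    return rng.choices(keys, weights=weights, k=1)[0]


def private_top_k(transactions, universe, k, epsilon, seed=None):
    # Illustrative assumption: `universe` is a public, data-independent item set.
    universe = set(universe)
    if not 1 <= k <= len(universe):
        raise ValueError("k must be between 1 and the size of the universe")
    rng = random.Random(seed)

    # Count how many transactions contain each item. Adding or removing
    # one transaction changes each count by at most 1.
    counts = Counter({item: 0 for item in universe})
    for transaction in transactions:
        for item in set(transaction) & universe:
            counts[item] += 1

    # Assumed budget split: half for selection, half for noisy counts,
    # each half divided evenly over the k released items.
    eps_select = epsilon / (2.0 * k)
    eps_count = epsilon / (2.0 * k)

    remaining = dict(counts)
    released = []
    for _ in range(k):
        item = exponential_mechanism(remaining, eps_select, rng)
        del remaining[item]  # "peel" the chosen item off before the next round
        noisy_count = counts[item] + laplace_noise(1.0 / eps_count, rng)
        released.append((item, noisy_count))
    return released


if __name__ == "__main__":
    data = [["bread", "milk"], ["bread"], ["milk", "eggs"], ["bread", "eggs"]]
    print(private_top_k(data, {"bread", "milk", "eggs", "tea"}, k=2, epsilon=1.0, seed=0))
```

With a small `epsilon` the returned list can differ noticeably from the true top-k, which is the kind of uncertainty the abstract refers to; a larger `epsilon` gives more accurate output at the cost of a weaker privacy guarantee.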

Bibliographic record

Source
ACM SIGKDD international conference on knowledge discovery and data mining; KDD 10 | 2011 | pp. 503-512 | 10 pages
Authors
Raghav Bhaskar; Srivatsan Laxman; Adam Smith; Abhradeep Thakurta
Original format: PDF
Chinese Library Classification: TP311.13
Keywords
Differential privacy; frequent itemsets; frequent patterns; exponential mechanism; privacy


