A Condensation Approach to Privacy Preserving Data Mining

Abstract

In recent years, privacy-preserving data mining has become an important problem because of the large amount of personal data tracked by many business applications. In many cases, users are unwilling to provide personal information unless the privacy of sensitive information is guaranteed. In this paper, we propose a new framework for privacy-preserving data mining of multi-dimensional data. Previous work on privacy-preserving data mining uses a perturbation approach, which reconstructs data distributions in order to perform the mining. Such an approach treats each dimension independently and therefore ignores the correlations between the different dimensions. In addition, it requires the development of a new distribution-based algorithm for each data mining problem, since it takes aggregate distributions of the data as input rather than the multi-dimensional records. This leads to a fundamental redesign of data mining algorithms. We develop a new and flexible approach to privacy-preserving data mining that does not require new problem-specific algorithms, since it maps the original data set into a new anonymized data set. This anonymized data closely matches the characteristics of the original data, including the correlations among the different dimensions. We present empirical results illustrating the effectiveness of the method.
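
The abstract does not say how the anonymized data set is built, so the following is only a minimal sketch of one way such a mapping could work, not the paper's algorithm. It assumes that records are greedily grouped with their nearest neighbours into groups of at least k, that only each group's count, mean, and covariance are kept, and that synthetic records are drawn from a Gaussian with those statistics. The function names (`condense`, `group_statistics`, `cholesky`, `synthesize`), the greedy grouping rule, and the Gaussian sampling are all assumptions made for illustration.

```python
import math
import random


def group_statistics(group):
    """Return (count, mean vector, covariance matrix) for one group of records."""
    n, d = len(group), len(group[0])
    mean = [sum(r[j] for r in group) / n for j in range(d)]
    cov = [[sum((r[i] - mean[i]) * (r[j] - mean[j]) for r in group) / n
            for j in range(d)] for i in range(d)]
    return n, mean, cov


def condense(records, k):
    """Greedily split records into groups of at least k and keep only statistics.

    Assumed grouping rule: take one record as a seed and group it with its
    k - 1 nearest neighbours. A tail shorter than 2k becomes one final group,
    so no group is smaller than k (provided there are at least k records).
    """
    remaining = list(records)
    stats = []
    while remaining:
        if len(remaining) < 2 * k:
            group, remaining = remaining, []
        else:
            seed = remaining.pop()
            remaining.sort(key=lambda r: math.dist(r, seed))
            group = [seed] + remaining[:k - 1]
            remaining = remaining[k - 1:]
        stats.append(group_statistics(group))
    return stats


def cholesky(cov, jitter=1e-9):
    """Lower-triangular factor L with L * L^T approximately equal to cov.

    The small jitter keeps the diagonal positive for near-singular groups.
    """
    d = len(cov)
    L = [[0.0] * d for _ in range(d)]
    for i in range(d):
        for j in range(i + 1):
            s = cov[i][j] - sum(L[i][m] * L[j][m] for m in range(j))
            if i == j:
                L[i][j] = math.sqrt(max(s, 0.0) + jitter)
            else:
                L[i][j] = s / L[j][j]
    return L


def synthesize(stats, rng):
    """Draw one synthetic record per original record from each group's statistics.

    Assumption: Gaussian sampling, x = mean + L z with z standard normal, so the
    correlations between dimensions inside each group carry over in expectation.
    """
    out = []
    for n, mean, cov in stats:
        L = cholesky(cov)
        d = len(mean)
        for _ in range(n):
            z = [rng.gauss(0.0, 1.0) for _ in range(d)]
            out.append(tuple(mean[i] + sum(L[i][m] * z[m] for m in range(i + 1))
                             for i in range(d)))
    return out
```

Calling `synthesize(condense(records, k), random.Random(0))` yields a data set of the same size and dimensionality as `records`, so an existing mining algorithm that expects multi-dimensional records could be run on it unchanged. Under the assumptions above, no synthetic record corresponds to a single original record, while the cross-dimension correlations are retained through the per-group covariance.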