Journal: IEEE Transactions on Knowledge and Data Engineering

Efficient Multidimensional Suppression for K-Anonymity

Abstract

Many applications that employ data mining techniques involve mining data that include private and sensitive information about the subjects. One way to enable effective data mining while preserving privacy is to anonymize the data set that includes private information about subjects before it is released for data mining. One way to anonymize a data set is to manipulate its content so that the records adhere to k-anonymity. Two common manipulation techniques used to achieve k-anonymity of a data set are generalization and suppression. Generalization refers to replacing a value with a less specific but semantically consistent value, while suppression refers to not releasing a value at all. Generalization is more commonly applied in this domain since suppression may dramatically reduce the quality of the data mining results if not properly used. However, generalization has a major drawback: it requires a manually generated domain hierarchy taxonomy for every quasi-identifier in the data set on which k-anonymity has to be performed. In this paper, we propose a new method for achieving k-anonymity named K-anonymity of Classification Trees Using Suppression (kACTUS). In kACTUS, efficient multidimensional suppression is performed, i.e., values are suppressed only on certain records depending on other attribute values, without the need for manually produced domain hierarchy trees. Thus, in kACTUS, we identify attributes that have less influence on the classification of the data records and suppress them if needed in order to comply with k-anonymity. The kACTUS method was evaluated on 10 separate data sets to assess its accuracy as compared to other k-anonymity generalization- and suppression-based methods. Encouraging results suggest that kACTUS' predictive performance is better than that of existing k-anonymity algorithms. Specifically, on average, the accuracies of TDS, TDR, and kADET are lower than that of kACTUS by 3.5, 3.3, and 1.9 percent, respectively, despite their usage of manually defined domain trees. The accuracy gap increases to 5.3, 4.3, and 3.1 percent, respectively, when no domain trees are used.
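
To make the terms in the abstract concrete, the following is a minimal Python sketch of a k-anonymity check and of record-level suppression, where a value is hidden only in those records whose quasi-identifier combination is too rare. It is not the kACTUS algorithm: the function names, the "*" placeholder, the toy records, and the choice of which attribute to suppress are all assumptions made for illustration. In particular, kACTUS chooses what to suppress based on a classification tree, which this sketch does not attempt.

```python
from collections import Counter

SUPPRESSED = "*"  # hypothetical placeholder for a value that is not released


def group_sizes(records, quasi_identifiers):
    """Count how many records share each combination of quasi-identifier values."""
    return Counter(tuple(r[q] for q in quasi_identifiers) for r in records)


def is_k_anonymous(records, quasi_identifiers, k):
    """True if every quasi-identifier combination occurs in at least k records."""
    return all(n >= k for n in group_sizes(records, quasi_identifiers).values())


def suppress_small_groups(records, quasi_identifiers, attribute, k):
    """Suppress `attribute` only in records whose group has fewer than k members.

    Records in sufficiently large groups keep their original value, so the
    suppression depends on the other attribute values of each record.
    """
    sizes = group_sizes(records, quasi_identifiers)
    result = []
    for r in records:
        key = tuple(r[q] for q in quasi_identifiers)
        if sizes[key] < k:
            r = {**r, attribute: SUPPRESSED}  # copy; the input is left unchanged
        result.append(r)
    return result


# Toy data (invented for illustration): "sex" and "zip" are quasi-identifiers.
records = [
    {"sex": "F", "zip": "12345", "label": "yes"},
    {"sex": "F", "zip": "12345", "label": "no"},
    {"sex": "F", "zip": "54321", "label": "yes"},
    {"sex": "F", "zip": "67890", "label": "no"},
]
qi = ["sex", "zip"]

anonymized = suppress_small_groups(records, qi, "zip", k=2)
print(is_k_anonymous(records, qi, 2), is_k_anonymous(anonymized, qi, 2))
```

In this toy data the first two records already form a group of size 2 and keep their zip code, while the last two records are each unique and have only their zip code suppressed. A single pass like this is not guaranteed to reach k-anonymity on arbitrary data; it only illustrates the idea of suppressing values on some records rather than on a whole column.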
