Journal: Knowledge-Based Systems

Hierarchical anonymization algorithms against background knowledge attack in data releasing



Abstract

Preserving privacy in the presence of an adversary's background knowledge is very important in data publishing. The k-anonymity model, while protecting identity, does not protect against attribute disclosure. One of the strong refinements of k-anonymity, beta-likeness, does not protect against identity disclosure. Neither model protects against attacks that exploit background knowledge. This research proposes two approaches for generating k-anonymous beta-likeness datasets that protect against identity and attribute disclosures and prevent attacks that exploit, as the adversary's background knowledge, any data correlations between quasi-identifiers (QIs) and sensitive attribute values. In particular, two hierarchical anonymization algorithms are proposed. In their first stage, both algorithms apply agglomerative clustering techniques to generate clusters of records whose probability distributions, as extracted from background knowledge, are similar. In the next phase, k-anonymity and beta-likeness are enforced in order to prevent identity and attribute disclosures. Our extensive experiments demonstrate that the proposed algorithms outperform other state-of-the-art anonymization algorithms in terms of privacy and data utility, while leaving fewer records unpublished than the others. Because well-known information loss metrics fail to measure precisely the data inaccuracies that stem from removing records that cannot be published in any equivalence class, this research also introduces an extension of the Global Certainty Penalty metric that accounts for unpublished records. (C) 2016 Elsevier B.V. All rights reserved.
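
The two privacy conditions named in the abstract can be illustrated with a minimal Python sketch. It is not the paper's algorithm: it omits the agglomerative clustering stage and the background-knowledge modeling entirely, and only checks whether an already-formed set of equivalence classes is k-anonymous and satisfies basic beta-likeness. The sketch assumes the commonly cited basic form of beta-likeness, in which the relative gain (q - p) / p of every sensitive value in a class must not exceed beta, where p is the value's overall frequency and q its frequency within the class. The function names and the toy data are hypothetical.

```python
from collections import Counter


def is_k_anonymous(classes, k):
    """Every equivalence class must contain at least k records."""
    return all(len(ec) >= k for ec in classes)


def max_relative_gain(ec_values, overall_freq):
    """Largest relative increase (q - p) / p of any sensitive value in one class.

    ec_values: sensitive values of the records in one equivalence class.
    overall_freq: frequency p of each sensitive value in the whole table.
    """
    counts = Counter(ec_values)
    n = len(ec_values)
    gain = 0.0
    for value, count in counts.items():
        q = count / n              # frequency inside the class
        p = overall_freq[value]    # frequency in the whole table
        gain = max(gain, (q - p) / p)
    return gain


def satisfies_basic_beta_likeness(classes, beta):
    """Assumed basic form: the relative gain in every class is at most beta."""
    all_values = [v for ec in classes for v in ec]
    total = len(all_values)
    overall = {v: c / total for v, c in Counter(all_values).items()}
    return all(max_relative_gain(ec, overall) <= beta for ec in classes)


# Hypothetical toy data: each class is the list of sensitive values it contains.
toy_classes = [["flu", "flu", "cancer"], ["flu", "cancer", "cancer"]]
print(is_k_anonymous(toy_classes, 3))
print(satisfies_basic_beta_likeness(toy_classes, 0.5))
```

In this toy table each value has an overall frequency of 1/2 and a within-class frequency of at most 2/3, so the largest relative gain is 1/3. The overall distribution is computed from the published classes only; a table with suppressed records would need the original distribution instead.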
