Conference: International Conference on Healthcare Science and Engineering

An Improved Data Anonymization Algorithm for Incomplete Medical Dataset Publishing

Abstract

To protect patients' sensitive information and prevent privacy leakage, medical datasets must be anonymized before they are published. Most existing anonymization techniques discard records with missing data, which causes large differences in data characteristics during anonymization and results in severe information loss. To solve this problem, we propose DAIMDL, a novel data anonymization algorithm for incomplete medical datasets based on the L-diversity algorithm. While preserving records with missing data, DAIMDL clusters the data using an improved k-member algorithm and uses the information entropy generated by data generalization to calculate distance in the clustering stage. The data groups obtained by clustering are then generalized. The experimental results show that DAIMDL better protects patients' sensitive attributes, reduces information loss when anonymizing data with missing values, and improves the availability of the dataset.
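The pipeline described in the abstract (keep incomplete records, cluster greedily, then generalize each group) can be illustrated with a minimal Python sketch. It is not the authors' DAIMDL implementation. The attribute names, the `entropy_cost` function, the rule of skipping missing values, and the way undersized groups are merged are all assumptions made for illustration. In particular, a plain Shannon entropy over observed values stands in for the paper's entropy-based distance, whose exact definition is not given in the abstract.

```python
import math
from collections import Counter

QI = ("age", "zip")      # hypothetical quasi-identifier attributes
SENSITIVE = "disease"    # hypothetical sensitive attribute


def entropy_cost(cluster):
    """Sum, over quasi-identifiers, of the Shannon entropy of observed values.

    Assumption: missing values (None) are skipped rather than penalized,
    so incomplete records remain usable during clustering.
    """
    total = 0.0
    for attr in QI:
        values = [r[attr] for r in cluster if r[attr] is not None]
        if not values:
            continue
        n = len(values)
        total -= sum(c / n * math.log2(c / n) for c in Counter(values).values())
    return total


def is_l_diverse(cluster, l):
    """True if the cluster holds at least l distinct sensitive values."""
    return len({r[SENSITIVE] for r in cluster}) >= l


def cluster_records(records, k, l):
    """Greedy k-member-style grouping: grow each cluster with the record
    that yields the lowest entropy cost until it has k members and is
    l-diverse. A tail that cannot meet both conditions is merged into
    the previous cluster (an assumption of this sketch)."""
    remaining = list(records)
    clusters = []
    while remaining:
        cluster = [remaining.pop(0)]
        while remaining and (len(cluster) < k or not is_l_diverse(cluster, l)):
            best = min(range(len(remaining)),
                       key=lambda i: entropy_cost(cluster + [remaining[i]]))
            cluster.append(remaining.pop(best))
        if clusters and (len(cluster) < k or not is_l_diverse(cluster, l)):
            clusters[-1].extend(cluster)
        else:
            clusters.append(cluster)
    return clusters


def generalize(cluster):
    """Replace each quasi-identifier with one value shared by the group:
    a range for numbers, a set for categories, '*' if nothing is known."""
    shared = {}
    for attr in QI:
        values = sorted({r[attr] for r in cluster if r[attr] is not None})
        if not values:
            shared[attr] = "*"
        elif len(values) == 1:
            shared[attr] = str(values[0])
        elif all(isinstance(v, (int, float)) for v in values):
            shared[attr] = f"{values[0]}-{values[-1]}"
        else:
            shared[attr] = "{" + ",".join(map(str, values)) + "}"
    return [{**shared, SENSITIVE: r[SENSITIVE]} for r in cluster]


# Made-up example records; None marks a missing value.
records = [
    {"age": 34, "zip": "10001", "disease": "flu"},
    {"age": 36, "zip": None, "disease": "asthma"},
    {"age": None, "zip": "10001", "disease": "diabetes"},
    {"age": 61, "zip": "20002", "disease": "flu"},
    {"age": 63, "zip": "20002", "disease": "cancer"},
    {"age": 60, "zip": None, "disease": "asthma"},
]
clusters = cluster_records(records, k=3, l=2)
for group in clusters:
    for row in generalize(group):
        print(row)
```

Because the entropy here only counts how many distinct values a group contains, it ignores numeric closeness (34 versus 36 costs the same as 34 versus 61). A distance that is aware of generalization hierarchies or numeric ranges would be closer to what the abstract describes.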
