
Electronic Medical Records privacy preservation through k-anonymity clustering method


Abstract

Electronic Medical Records (EMRs) enable patient medical data to be shared whenever it is needed, and they also serve as a tool for building new medical technology and patient recommendation systems. Because EMRs include patients' private data, access by researchers is restricted. An anonymization technique is therefore needed that keeps patients' private data safe without damaging useful medical information. k-member clustering anonymization treats k-anonymization as a clustering problem. The objective of the k-member clustering problem is to group records so as to minimize the data distortion introduced during data generalization. Most previous clustering techniques rely on random seed selection. However, selecting a cluster seed at random yields inconsistent performance. The authors propose a k-member cluster seed selection algorithm (KMCSSA) that differs from previous clustering methods: instead of selecting a cluster seed at random, it selects the seed based on closeness centrality, in order to provide consistent information loss (IL) and to reduce information distortion. The Adult dataset from the University of California, Irvine (UCI) Machine Learning Repository was used for the experiment. A comparison of the proposed method with two previous methods shows that KMCSSA provides superior performance with respect to information loss. The authors thus provide a privacy protection algorithm that yields consistent information loss and reduces the overall information distortion.
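
The abstract gives no pseudocode for KMCSSA, so the following is only a minimal sketch of the general idea, not the authors' algorithm. It assumes purely numeric quasi-identifiers, Euclidean distance, closeness defined as the inverse of the mean distance to the other unassigned records, and a common range-based information loss metric. The function names (`closeness`, `k_member_clusters`, `information_loss`) and the sample records are hypothetical.

```python
from math import dist  # Euclidean distance, Python 3.8+


def closeness(i, candidates, points):
    """Assumed definition: inverse of the mean distance from record i
    to the other candidate records."""
    others = [j for j in candidates if j != i]
    if not others:
        return 0.0
    total = sum(dist(points[i], points[j]) for j in others)
    return len(others) / total if total > 0 else float("inf")


def k_member_clusters(points, k):
    """Greedy k-member clustering with a deterministic, centrality-based seed."""
    unassigned = set(range(len(points)))
    clusters = []
    while len(unassigned) >= k:
        # Pick the seed with the highest closeness instead of a random record.
        seed = max(sorted(unassigned),
                   key=lambda i: closeness(i, unassigned, points))
        unassigned.remove(seed)
        # Assumption: fill the cluster with the k-1 records nearest to the seed.
        nearest = sorted(unassigned,
                         key=lambda j: (dist(points[seed], points[j]), j))[:k - 1]
        unassigned.difference_update(nearest)
        clusters.append([seed] + nearest)
    if not clusters:
        # Fewer than k records in total: they can only form a single group.
        return [sorted(unassigned)] if unassigned else []
    # Fewer than k records are left: attach each to the cluster with the nearest seed.
    for i in sorted(unassigned):
        best = min(clusters, key=lambda c: dist(points[i], points[c[0]]))
        best.append(i)
    return clusters


def information_loss(clusters, points):
    """Range-based IL: cluster size times the sum of normalized attribute ranges."""
    dims = range(len(points[0]))
    full = [max(p[d] for p in points) - min(p[d] for p in points) for d in dims]
    total = 0.0
    for c in clusters:
        spread = 0.0
        for d in dims:
            if full[d] > 0:
                vals = [points[i][d] for i in c]
                spread += (max(vals) - min(vals)) / full[d]
        total += len(c) * spread
    return total


# Hypothetical (age, hours-per-week) records, for illustration only.
records = [(25, 40), (27, 42), (26, 38), (60, 20), (62, 22), (61, 18)]
clusters = k_member_clusters(records, k=3)
print(clusters, information_loss(clusters, records))
```

Because the seed is chosen by a fixed rule rather than drawn at random, repeated runs on the same data produce the same clusters and therefore the same information loss, which is the consistency property the abstract emphasizes.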
