An Improved Data Anonymization Algorithm for Incomplete Medical Dataset Publishing

Abstract

To protect patients' sensitive information and prevent privacy leakage, data must be anonymized before a medical dataset is published. Most existing anonymization techniques discard records with missing data, which causes large differences in data characteristics during anonymization and results in severe information loss. To solve this problem, we propose DAIMDL, a novel data anonymization algorithm for incomplete medical datasets based on the L-diversity algorithm. While preserving records with missing data, DAIMDL clusters the data using an improved k-member algorithm and uses the information entropy generated by data generalization to calculate distances in the clustering stage. The data groups obtained by clustering are then generalized. The experimental results show that DAIMDL protects patients' sensitive attributes better, reduces information loss when anonymizing data with missing values, and improves the availability of the dataset.
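The cluster-then-generalize pipeline summarized in the abstract can be illustrated with a minimal Python sketch. It is not the authors' DAIMDL implementation, whose exact distance formula is not given here. The following are all assumptions made for illustration: the `MISSING` marker, the entropy-based cost proxy in `group_cost`, the greedy seed order, the use of distinct l-diversity, the set-valued generalization, and the toy records.

```python
import math
from collections import Counter

MISSING = None  # assumed marker for a missing quasi-identifier value


def entropy(values):
    """Shannon entropy (bits) of the observed values; missing entries are skipped."""
    observed = [v for v in values if v is not MISSING]
    if not observed:
        return 0.0
    n = len(observed)
    return -sum(c / n * math.log2(c / n) for c in Counter(observed).values())


def group_cost(group, qi):
    """Assumed information-loss proxy: per-attribute entropy summed over a group."""
    return sum(entropy([r[a] for r in group]) for a in qi)


def is_l_diverse(group, sensitive, ell):
    """Distinct l-diversity: the group holds at least `ell` sensitive values."""
    return len({r[sensitive] for r in group}) >= ell


def cluster(records, qi, sensitive, k, ell):
    """Greedy k-member-style grouping that keeps records with missing values."""
    remaining = list(records)
    groups = []
    while remaining:
        group = [remaining.pop(0)]  # seed: simply the next unassigned record
        while remaining and (len(group) < k or not is_l_diverse(group, sensitive, ell)):
            seen = {r[sensitive] for r in group}
            fresh = [r for r in remaining if r[sensitive] not in seen]
            # While diversity is short, prefer records adding a new sensitive value.
            pool = fresh if len(seen) < ell and fresh else remaining
            best = min(pool, key=lambda r: group_cost(group + [r], qi))
            remaining.remove(best)
            group.append(best)
        if len(group) >= k and is_l_diverse(group, sensitive, ell):
            groups.append(group)
        elif groups:
            # Leftovers cannot form a valid group: attach each to the cheapest group.
            for r in group:
                min(groups, key=lambda g: group_cost(g + [r], qi)).append(r)
    return groups  # empty if no valid group could be formed at all


def generalize(group, qi, sensitive):
    """Replace each quasi-identifier by the set of values observed in the group."""
    general = {}
    for a in qi:
        observed = sorted({r[a] for r in group if r[a] is not MISSING})
        general[a] = "|".join(map(str, observed)) if observed else "*"
    return [{**general, sensitive: r[sensitive]} for r in group]


# Toy records (hypothetical); two of them have a missing quasi-identifier.
records = [
    {"age": 34, "zip": "4500", "disease": "flu"},
    {"age": 34, "zip": MISSING, "disease": "asthma"},
    {"age": 51, "zip": "4501", "disease": "flu"},
    {"age": 51, "zip": "4501", "disease": "diabetes"},
    {"age": MISSING, "zip": "4500", "disease": "asthma"},
]
QI, SENSITIVE = ["age", "zip"], "disease"

groups = cluster(records, QI, SENSITIVE, k=2, ell=2)
published = [row for g in groups for row in generalize(g, QI, SENSITIVE)]
for row in published:
    print(row)
```

In this sketch a record with a missing value is never discarded: it joins the group whose entropy cost grows least and is published with that group's generalized values.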

Bibliographic details

Source
International Conference on Healthcare Science and Engineering | 2018 | pp. 115-128 | 14 pages
Conference location: Guilin (CN)
Authors
Wei Liu; Mengli Pei; Congcong Cheng; Wei She; Chase Q. Wu;
Author affiliations

Software College of Zhengzhou University, Zhengzhou 450000, Henan, China; Collaborative Innovation Center for Internet Healthcare, Zhengzhou 450000, Henan, China;

Collaborative Innovation Center for Internet Healthcare, Zhengzhou 450000, Henan, China;

Department of Computer Science, New Jersey Institute of Technology, Newark, NJ 07102, USA;

Original format: PDF
Keywords
Data anonymization; L-diversity; Incomplete medical dataset; Missing data;

Similar literature

1. Improved k-Anonymize and l-Diverse Approach for Privacy Preserving Big Data Publishing Using MPSEC Dataset [J]. Jain Priyank, Gyanchandani Manasi, Khare Nilay. Computing and Informatics. 2020, No. 3
2. Walking Without Friends: Publishing Anonymized Trajectory Dataset Without Leaking Social Relationships [J]. Zhao Kai, Tu Zhen, Xu Fengli. IEEE Transactions on Network and Service Management. 2019, No. 3
3. An Improved Data Anonymization Algorithm for Incomplete Medical Dataset Publishing [C]. Wei Liu, Mengli Pei, Congcong Cheng. International Conference on Healthcare Science and Engineering. 2019
4. Fast Machine Learning Algorithms for Massive Datasets with Applications in the Biomedical Domain [D]. Sadrfaridpour, Ehsan. 2020
5. Attribute Utility Motivated k-anonymization of Datasets to Support the Heterogeneous Needs of Biomedical Researchers [O]. Huimin Ye, Elizabeth S. Chen. 2011
6. Improving the Utility of Anonymized Datasets through Dynamic Evaluation of Generalization Hierarchies [O]. Ayala-Rivera, Vanessa, Cerqueus, Thomas, Murphy, Liam, B.E. 2016
