Proximity-Aware Local-Recoding Anonymization with MapReduce for Scalable Big Data Privacy Preservation in Cloud

Zhang Xuyun; Dou Wanchun; Pei Jian; Nepal Surya; Yang Chi; Liu Chang; Chen Jinjun


Abstract

Cloud computing provides promising, scalable IT infrastructure to support the processing of a variety of big data applications in sectors such as healthcare and business. Data sets in such applications, such as electronic health records, often contain privacy-sensitive information, which raises privacy concerns if the information is released or shared with third parties in the cloud. A practical and widely adopted technique for data privacy preservation is to anonymize data via generalization to satisfy a given privacy model. However, most existing privacy-preserving approaches are tailored to small-scale data sets and often fall short when encountering big data, due to their insufficiency or poor scalability. In this paper, we investigate the local-recoding problem for big data anonymization against proximity privacy breaches and attempt to identify a scalable solution to this problem. Specifically, we present a proximity privacy model that allows for semantic proximity of sensitive values and multiple sensitive attributes, and we model the problem of local recoding as a proximity-aware clustering problem. To address this problem, we propose a scalable two-phase clustering approach consisting of a t-ancestors clustering algorithm (similar to k-means) and a proximity-aware agglomerative clustering algorithm. We design the algorithms with MapReduce to gain high scalability by performing data-parallel computation in the cloud. Extensive experiments on real-life data sets demonstrate that our approach significantly improves the capability of defending against proximity privacy breaches, as well as the scalability and time efficiency of local-recoding anonymization, over existing approaches.
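The abstract compares its first clustering phase to k-means and says the algorithms are designed with MapReduce. As a rough illustration of that general pattern only, the sketch below expresses one round of ordinary numeric k-means as a map step, a shuffle, and a reduce step in plain Python. It is not the paper's algorithm: it uses no Hadoop API, works on numeric tuples rather than generalized attribute values, and ignores proximity privacy entirely. All names (`map_assign`, `reduce_mean`, `kmeans_round`) are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple

Point = Tuple[float, ...]


def sq_dist(a: Point, b: Point) -> float:
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def map_assign(point: Point, centers: Sequence[Point]) -> Tuple[int, Point]:
    """Map step: emit (index of the nearest center, point)."""
    nearest = min(range(len(centers)), key=lambda i: sq_dist(point, centers[i]))
    return nearest, point


def reduce_mean(points: List[Point]) -> Point:
    """Reduce step: average all points that share one cluster key."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def kmeans_round(data: Sequence[Point], centers: Sequence[Point]) -> List[Point]:
    """Run one map/shuffle/reduce round and return the updated centers."""
    groups: Dict[int, List[Point]] = defaultdict(list)
    for p in data:  # in a real MapReduce job, mappers would process splits in parallel
        key, value = map_assign(p, centers)
        groups[key].append(value)  # shuffle: group emitted values by key
    # A center that attracted no points is kept unchanged (an assumption of this sketch).
    return [
        reduce_mean(groups[i]) if i in groups else centers[i]
        for i in range(len(centers))
    ]


if __name__ == "__main__":
    data = [(1.0, 1.0), (1.0, 3.0), (9.0, 9.0), (11.0, 9.0)]
    centers = [(0.0, 0.0), (10.0, 10.0)]
    print(kmeans_round(data, centers))
```

Because each map call depends only on one record and the current centers, the assignment work can be split across machines, which is the data-parallel property the abstract refers to.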


Bibliographic details

Source
Computers, IEEE Transactions on | 2015, Issue 8 | pp. 2293-2307 | 15 pages
Authors
Zhang Xuyun; Dou Wanchun; Pei Jian; Nepal Surya; Yang Chi; Liu Chang; Chen Jinjun
Author affiliation

Faculty of Engineering and IT, University of Technology, Sydney, NSW, Australia;

Original format: PDF
Language: English
Keywords
Big data; cloud computing; data anonymization; MapReduce; proximity privacy


Similar literature

1. Ameliorating the Privacy on Large Scale Aviation Dataset by Implementing MapReduce Multidimensional Hybrid k-Anonymization [J]. Stephen Dass A., Prabhu J. International Journal of Web Portals. 2019, no. 2
2. Privacy preserving big data publishing: a scalable k-anonymization approach using MapReduce [J]. Brijesh B. Mehta, Udai Pratap Rao. Software, IET. 2017, no. 5
3. A Scalable Two Phase Top Down Specialization Approach For Data Anonymization Using Mapreduce On Cloud [J]. Sameesha Vs. International Journal of Computer Trends and Technology. 2017, no. 1
4. A MapReduce Based Approach of Scalable Multidimensional Anonymization for Big Data Privacy Preservation on Cloud [C]. Zhang Xuyun, Yang Chi, Nepal Surya. 2013 IEEE Third International Conference on Cloud and Green Computing. 2013
5. Scalable parallel computing on clouds: Efficient and scalable architectures to perform pleasingly parallel, MapReduce and iterative data intensive computations on cloud environments [D]. Gunarathne, Thilina. 2014
6. Privacy preserving data anonymization of spontaneous ADE reporting system dataset [O]. Wen-Yang Lin, Duen-Chuan Yang, Jie-Teng Wang. 2016
7. A Survey on Data Anonymization Using Mapreduce on Cloud with Scalable Two-Phase Top-Down Approach [O]. M Dhasaratham, R P. Singh. 2018
