Journal: Applied Artificial Intelligence

Privacy Preserving Parallel Clustering Based Anonymization for Big Data Using MapReduce Framework



Abstract

Big data refers to massive volumes of data collected from heterogeneous sources, including Internet of Things (IoT) devices. Big data analytics plays a crucial role in extracting patterns that support efficient and effective decision making. Processing data at this scale raises several critical issues, such as scalability, security, and privacy. Numerous privacy-preserving data mining and data publishing techniques exist to protect data privacy. Data anonymization that uses data mining techniques to preserve an individual's privacy is a promising approach to protecting data against identity disclosure. In this paper, a Parallel Clustering based Anonymization Algorithm (PCAA) is proposed; the results show that the algorithm is scalable and achieves a better tradeoff between privacy and utility. The MapReduce framework is used to parallelize the anonymization process so that it can handle huge volumes of data. The algorithm performs well in terms of classification accuracy, F-measure, and Kullback-Leibler divergence. Moreover, big data generated from heterogeneous data sources is efficiently protected to meet the ever-growing requirements of applications.
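To make the idea of clustering-based anonymization in a map/reduce style concrete, here is a minimal sketch in plain Python. It is not the PCAA algorithm from the paper, whose details the abstract does not give. It assumes records of the hypothetical form (age, diagnosis), treats age as the only quasi-identifier, uses a simple sort-and-chunk grouping in place of a real clustering step, and simulates the map and reduce phases with ordinary functions rather than a MapReduce runtime. All function names are illustrative.

```python
from collections import Counter, defaultdict


def map_phase(records, num_partitions):
    """Map: route each (age, label) record to a partition by equal-width age band."""
    ages = [r[0] for r in records]
    lo, hi = min(ages), max(ages)
    width = (hi - lo) / num_partitions or 1  # avoid zero width when all ages match
    partitions = defaultdict(list)
    for rec in records:
        key = min(int((rec[0] - lo) / width), num_partitions - 1)
        partitions[key].append(rec)
    return partitions


def reduce_phase(partition, k):
    """Reduce: group one partition into clusters of at least k records and
    generalize each age to its cluster's (min, max) range."""
    if len(partition) < k:
        # Assumption of this sketch: every partition holds at least k records.
        raise ValueError("partition smaller than k; use fewer partitions")
    ordered = sorted(partition, key=lambda r: r[0])
    clusters = [ordered[i:i + k] for i in range(0, len(ordered), k)]
    if len(clusters) > 1 and len(clusters[-1]) < k:
        tail = clusters.pop()      # undersized last cluster
        clusters[-1].extend(tail)  # fold it into its neighbor
    generalized = []
    for cluster in clusters:
        cluster_ages = [r[0] for r in cluster]
        span = (min(cluster_ages), max(cluster_ages))
        generalized.extend((span, r[1]) for r in cluster)
    return generalized


def anonymize(records, k, num_partitions):
    """Run the map phase, then reduce each partition independently."""
    partitions = map_phase(records, num_partitions)
    result = []
    for key in sorted(partitions):
        result.extend(reduce_phase(partitions[key], k))
    return result


def is_k_anonymous(anonymized, k):
    """Check that every generalized age range is shared by at least k records."""
    counts = Counter(span for span, _ in anonymized)
    return all(count >= k for count in counts.values())


# Hypothetical toy data, for illustration only.
records = [(23, "flu"), (25, "cold"), (27, "flu"), (31, "asthma"),
           (52, "flu"), (55, "cold"), (58, "asthma"), (60, "flu")]
anonymized = anonymize(records, k=3, num_partitions=2)
print(anonymized)
print(is_k_anonymous(anonymized, 3))
```

Because each partition is reduced independently, the reduce calls could run in parallel on separate workers, which is the property a MapReduce deployment would exploit. Wider age ranges give stronger privacy but less useful data, which is the privacy and utility tradeoff the abstract refers to.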
