Parallel Improvement of the Apriori Algorithm Based on MapReduce

Qin Jun (秦军); Hao Tianshu (郝天曙); Dong Qianqian (董倩倩)


Abstract

MapReduce-based parallel Apriori algorithms remove the traditional Apriori algorithm's need to scan the database many times, but they still generate candidate sets by a serial self-join of the frequent itemsets, and they produce a large volume of intermediate candidate data. To improve the efficiency with which Apriori mines frequent itemsets, this paper parallelizes the join step of the MapReduce-based Apriori algorithm and proposes CApriori, a new algorithm for mining frequent itemsets in big-data environments. CApriori derives the candidate (k+1)-itemsets from the frequent k-itemsets in parallel through Map and Reduce phases, so that the entire process of generating frequent itemsets is parallelized. This reduces the number of candidates produced during iteration and saves storage space and time. A comparison of the time complexity of the two algorithms indicates that the improved algorithm substantially reduces the time spent in the join step when processing large-scale data. Experiments with CApriori on the Hadoop platform show that it is more efficient both on large datasets and at small support thresholds, and that it achieves excellent speedup.
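The abstract does not give the details of CApriori's Map and Reduce functions, so the following is only a minimal sketch of how a join step could be expressed in map/reduce form. It uses the standard Apriori prefix join, simulated in plain Python rather than on Hadoop. The function names (`map_phase`, `shuffle`, `reduce_phase`, `generate_candidates`) are hypothetical and are not taken from the paper. The mapper keys each sorted frequent k-itemset by its first k-1 items; the reducer pairs up the last items that share a key and keeps a (k+1)-candidate only if all of its k-subsets are frequent.

```python
from collections import defaultdict
from itertools import combinations


def map_phase(frequent_k):
    """Map: emit (prefix, last_item) for every sorted frequent k-itemset."""
    for itemset in frequent_k:
        items = tuple(sorted(itemset))
        yield items[:-1], items[-1]


def shuffle(pairs):
    """Group mapper output by key, standing in for the framework's shuffle."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(prefix, last_items, frequent_k):
    """Reduce: join itemsets that share a prefix, then prune."""
    for a, b in combinations(sorted(set(last_items)), 2):
        candidate = prefix + (a, b)
        # Apriori property: every k-subset of a candidate must be frequent.
        subsets = combinations(candidate, len(candidate) - 1)
        if all(sub in frequent_k for sub in subsets):
            yield candidate


def generate_candidates(frequent_k):
    """Run the simulated map, shuffle, and reduce phases once."""
    frequent_k = {tuple(sorted(s)) for s in frequent_k}
    groups = shuffle(map_phase(frequent_k))
    candidates = []
    for prefix, last_items in groups.items():
        candidates.extend(reduce_phase(prefix, last_items, frequent_k))
    return sorted(candidates)


if __name__ == "__main__":
    f2 = [("A", "B"), ("A", "C"), ("B", "C"), ("B", "D")]
    print(generate_candidates(f2))
```

In this example the itemsets sharing the prefix `("B",)` would join to `("B", "C", "D")`, but that candidate is pruned because `("C", "D")` is not frequent. Because each prefix group is processed independently, the reducers could in principle run in parallel, which is the property the abstract attributes to the parallelized join step.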

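Bibliographic record

Source
Computer Technology and Development (《计算机技术与发展》) | 2017, No. 4 | pp. 64-68 | 5 pages
Authors
Qin Jun (秦军); Hao Tianshu (郝天曙); Dong Qianqian (董倩倩)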
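Author affiliations

School of Educational Science and Technology, Nanjing University of Posts and Telecommunications, Nanjing, Jiangsu 210003;
School of Computer Science, Nanjing University of Posts and Telecommunications, Nanjing, Jiangsu 210003;
School of Computer Science, Nanjing University of Posts and Telecommunications, Nanjing, Jiangsu 210003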

Original format: PDF
Language of text: Chinese (chi)
Chinese Library Classification: algorithm theory
Keywords
association rules; data mining; MapReduce; Apriori

