Journal: Computers and Electrical Engineering

Performance optimization of MapReduce-based Apriori algorithm on Hadoop cluster

Abstract

Many techniques have been proposed to implement the Apriori algorithm on the MapReduce framework, but only a few have focused on performance improvement. The FPC (Fixed Passes Combined-counting) and DPC (Dynamic Passes Combined-counting) algorithms combine multiple passes of Apriori in a single MapReduce phase to reduce execution time. In this paper, we propose two improved MapReduce-based Apriori algorithms, VFPC (Variable Size based Fixed Passes Combined-counting) and ETDPC (Elapsed Time based Dynamic Passes Combined-counting), as improvements over FPC and DPC. Further, we optimize the multi-pass phases of these algorithms by skipping the pruning step in some passes, and propose the Optimized-VFPC and Optimized-ETDPC algorithms. Quantitative analysis reveals that the cost of counting the additional un-pruned candidates produced by skipping pruning is less significant than the computation cost saved by skipping it. Experimental results show that VFPC and ETDPC are more robust and flexible than FPC and DPC, whereas their optimized versions are more efficient in terms of execution time. (C) 2017 Elsevier Ltd. All rights reserved.
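
The idea of combining passes and skipping the pruning step can be illustrated with a small single-machine sketch. This is not the authors' implementation and does not use Hadoop: the function names (`join`, `prune`, `combined_candidates`, `count_support`) are hypothetical, and the sketch assumes that candidates for later passes are generated from the candidates of the previous pass, since no counting happens in between. It omits the variable-size and elapsed-time policies that decide how many passes to combine.

```python
from collections import Counter
from itertools import combinations


def join(itemsets):
    """Apriori join: merge two sorted k-itemsets that share their first k-1 items."""
    items = sorted(itemsets)
    out = set()
    for i, a in enumerate(items):
        for b in items[i + 1:]:
            if a[:-1] != b[:-1]:
                break  # sorted order: no later itemset shares this prefix
            out.add(a + (b[-1],))
    return out


def prune(candidates, previous):
    """Keep only candidates whose every (k-1)-subset appears in `previous`."""
    return {
        c for c in candidates
        if all(s in previous for s in combinations(c, len(c) - 1))
    }


def combined_candidates(frequent, passes, do_prune=True):
    """Generate candidates for several consecutive passes with no counting in between.

    Assumption of this sketch: each level is built from the previous level's
    candidates, because their supports are not yet known.
    """
    levels, current = [], set(frequent)
    for _ in range(passes):
        nxt = join(current)
        if do_prune:
            nxt = prune(nxt, current)
        if not nxt:
            break
        levels.append(nxt)
        current = nxt
    return levels


def count_support(transactions, levels):
    """Count all candidate levels in a single scan (one combined 'phase')."""
    counts = Counter()
    for t in transactions:
        items = set(t)
        for level in levels:
            for c in level:
                if items.issuperset(c):
                    counts[c] += 1
    return counts
```

With frequent 2-itemsets {ab, ac, bc, bd}, pruning removes the candidate bcd because its subset cd is not frequent, while the un-pruned variant keeps bcd and pays for counting it. Filtering by minimum support afterwards gives the same frequent itemsets in both cases, because the join of a superset of the frequent itemsets still contains every frequent itemset of the next level.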
