
An Efficient Algorithm of Frequent Itemsets Mining Based on MapReduce



Abstract

Mainstream parallel algorithms for mining frequent itemsets (patterns) implement FP-Growth or Apriori on the MapReduce (MR) framework. Existing MR FP-Growth algorithms cannot distribute data evenly among nodes, and MR Apriori algorithms require multiple map/reduce passes and generate too many key-value pairs with a value of 1; these drawbacks hinder their performance. This paper proposes an algorithm, FIMMR, which first mines local frequent itemsets from each data chunk as candidates, applies pruning strategies to the candidates, and then identifies the global frequent itemsets among them. Experimental results show that FIMMR significantly outperforms PFP and SPC in time efficiency; under a small minimum support threshold, FIMMR achieves an order-of-magnitude improvement over the other two algorithms, and its speedup is also satisfactory.
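The local-then-global idea in the abstract can be illustrated with a small single-process sketch. This is not the authors' FIMMR implementation: the abstract does not specify the pruning strategies or the MapReduce job layout, so pruning is omitted, chunks are plain Python lists, and local mining is done by brute-force enumeration. All function names and the `min_support_ratio` parameter are hypothetical. The sketch relies on a known property of partition-based mining: an itemset that meets the support ratio over the whole dataset must meet it in at least one chunk, so the union of local results is a complete candidate set.

```python
from collections import Counter
from itertools import combinations


def local_frequent_itemsets(chunk, min_support_ratio):
    """Phase 1: brute-force mine the itemsets that are frequent within one chunk."""
    threshold = min_support_ratio * len(chunk)
    counts = Counter()
    for transaction in chunk:
        items = sorted(set(transaction))
        # Enumerate every non-empty subset (fine for toy data only).
        for size in range(1, len(items) + 1):
            for itemset in combinations(items, size):
                counts[itemset] += 1
    return {itemset for itemset, n in counts.items() if n >= threshold}


def count_candidates(chunk, candidates):
    """Phase 2, per chunk: count how many transactions contain each candidate."""
    counts = Counter()
    for transaction in chunk:
        items = set(transaction)
        for candidate in candidates:
            if items.issuperset(candidate):
                counts[candidate] += 1
    return counts


def mine_frequent_itemsets(chunks, min_support_ratio):
    """Union the local results as candidates, then keep the globally frequent ones."""
    candidates = set()
    for chunk in chunks:
        candidates |= local_frequent_itemsets(chunk, min_support_ratio)

    totals = Counter()
    for chunk in chunks:
        totals.update(count_candidates(chunk, candidates))

    threshold = min_support_ratio * sum(len(chunk) for chunk in chunks)
    return {itemset: n for itemset, n in totals.items() if n >= threshold}


if __name__ == "__main__":
    chunks = [
        [["a", "b"], ["a", "b", "c"]],
        [["a", "c"], ["b"]],
    ]
    for itemset, support in sorted(mine_frequent_itemsets(chunks, 0.5).items()):
        print(itemset, support)
```

In a real MapReduce job, each call to `local_frequent_itemsets` and `count_candidates` would run on a separate node, and the two loops in `mine_frequent_itemsets` would correspond to the reduce steps; the pruning step described in the abstract would sit between the two phases to shrink the candidate set.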

Bibliographic record

  • Source
    Journal of Information and Computational Science, 2014, Issue 8, pp. 2809-2816 (8 pages)
  • Author affiliations

    School of Computer Science and Technology, Faculty of Electronic Information and Electrical Engineering, and School of Innovation and Experiment, Dalian University of Technology, Dalian 116024, China


  • Original format: PDF
  • Language: English
  • Keywords

    Frequent Itemsets; Frequent Patterns; Big Data; MapReduce; Data Mining;

