Journal: Parallel Processing Letters

MR-ARM: A MAP-REDUCE ASSOCIATION RULE MINING FRAMEWORK

Abstract

Association rule mining is one of the primary tasks in data mining; it discovers correlations among items in a transactional database. Most vertical and horizontal association rule mining algorithms have been developed to improve the frequent item discovery step, which places high demands on training time and memory usage, particularly when the input database is very large. In this paper, we address the problem of mining very large data by proposing a new parallel Map-Reduce (MR) association rule mining technique called MR-ARM, which uses a hybrid data transformation format to quickly find frequent items and generate rules. The MR programming paradigm is becoming popular for large-scale, data-intensive distributed applications due to its efficiency, simplicity and ease of use; the proposed algorithm therefore develops a fast, parallel, distributed batch set intersection method for finding frequent items. Two implementations (Weka, Hadoop) of the proposed MR association rule algorithm have been developed, and a number of experiments on small, medium and large data collections have been conducted. The comparisons are based on the time required by the algorithm for data initialisation, frequent item discovery, rule generation, etc. The results show that MR-ARM is a very useful tool for mining association rules from large datasets in a distributed environment.
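
The abstract does not give the details of MR-ARM's hybrid format or its batch intersection method, so the following is only a minimal, single-process sketch of the general idea it builds on: a map step turns horizontal transactions into (item, transaction id) pairs, a reduce step groups them into per-item transaction-id sets (the vertical layout), and the support of a larger itemset is obtained by intersecting those sets. All function names (`map_phase`, `reduce_phase`, `frequent_itemsets`) and the absolute `min_support` threshold are assumptions made for illustration, not the authors' implementation or any Hadoop or Weka API.

```python
from collections import defaultdict
from itertools import combinations


def map_phase(transactions):
    """Map step (illustrative): emit one (item, tid) pair per item occurrence."""
    for tid, items in enumerate(transactions):
        for item in set(items):
            yield item, tid


def reduce_phase(pairs):
    """Reduce step (illustrative): group tids by item into tid-sets."""
    tidsets = defaultdict(set)
    for item, tid in pairs:
        tidsets[item].add(tid)
    return dict(tidsets)


def frequent_itemsets(transactions, min_support):
    """Return {itemset: support} for itemsets seen in >= min_support transactions."""
    tidsets = reduce_phase(map_phase(transactions))
    # Level 1: frequent single items with their tid-sets.
    level = {
        frozenset([item]): tids
        for item, tids in tidsets.items()
        if len(tids) >= min_support
    }
    result = {}
    while level:
        result.update({itemset: len(tids) for itemset, tids in level.items()})
        next_level = {}
        # Join pairs of frequent k-itemsets that differ in exactly one item.
        for (a, tids_a), (b, tids_b) in combinations(level.items(), 2):
            candidate = a | b
            if len(candidate) != len(a) + 1 or candidate in next_level:
                continue
            # Support of the candidate is the size of the tid-set intersection.
            tids = tids_a & tids_b
            if len(tids) >= min_support:
                next_level[candidate] = tids
        level = next_level
    return result


if __name__ == "__main__":
    demo = [["a", "b", "c"], ["a", "b"], ["a", "c"], ["b", "c"], ["a", "b", "c"]]
    for itemset, support in sorted(
        frequent_itemsets(demo, 3).items(), key=lambda kv: sorted(kv[0])
    ):
        print(sorted(itemset), support)
```

In a real Map-Reduce deployment the two phases would run across worker nodes and the intersections would be batched per partition; here everything runs in one process so that the intersection logic is easy to follow.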