
Adaptive-Miner: an efficient distributed association rule mining algorithm on Spark

Abstract

Extracting valuable data from large datasets is one of the most important research problems. Association rule mining is one of the most widely used methods for this purpose. Finding possible associations between items in large transaction-based datasets (finding frequent itemsets) is the most crucial part of the association rule mining task. Many single-machine association rule mining algorithms exist, but the massive amount of data available today exceeds the capacity of a single-machine algorithm. Therefore, to meet the demands of this ever-growing data, there is a need for a distributed association rule mining algorithm that can run on multiple machines. For these types of parallel/distributed applications, MapReduce is one of the best fault-tolerant frameworks. Hadoop is one of the most popular open-source software frameworks with a MapReduce-based approach for distributed storage and processing of large datasets, using standalone clusters built from commodity hardware. However, the heavy disk I/O at each iteration of a highly iterative algorithm like Apriori makes Hadoop inefficient. A number of MapReduce-based platforms have been developed for parallel computing in recent years. Among them, Spark has attracted a lot of attention because of its built-in support for distributed computations. We therefore implemented a distributed association rule mining algorithm on Spark, named Adaptive-Miner, which uses an adaptive approach to find frequent patterns with higher accuracy and efficiency. Adaptive-Miner uses an adaptive strategy based on partial processing of datasets. It makes execution plans before every iteration and proceeds with the most suitable plan to minimize time and space complexity. Adaptive-Miner is a dynamic association rule mining algorithm that changes its approach based on the nature of the dataset. It is therefore different from, and better than, state-of-the-art static association rule mining algorithms. We conduct in-depth experiments to gain insight into the effectiveness, efficiency, and scalability of the Adaptive-Miner algorithm on Spark. Available: https://github.com/sanjaysinghrathi/Adaptive-Miner
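
To make the iterative nature of Apriori concrete, here is a minimal single-machine sketch in Python. It is not the Adaptive-Miner implementation and does not use Spark; the function name `apriori`, the parameter `min_support`, and the sample baskets are assumptions chosen for illustration. It shows why each iteration needs a full pass over the transactions, which is the step that incurs disk I/O on Hadoop.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {frozenset(itemset): support count} for all frequent itemsets.

    Illustrative level-wise Apriori; min_support is an absolute count.
    """
    transactions = [frozenset(t) for t in transactions]

    # Iteration 1: count every single item.
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    frequent = {s: c for s, c in counts.items() if c >= min_support}
    result = dict(frequent)

    k = 2
    while frequent:
        prev = set(frequent)
        # Join step: combine frequent (k-1)-itemsets into k-item candidates.
        candidates = {a | b for a in prev for b in prev if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must itself be frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(s) in prev for s in combinations(c, k - 1))
        }
        # Counting step: one full pass over the data per iteration.
        counts = {c: 0 for c in candidates}
        for t in transactions:
            for c in candidates:
                if c <= t:
                    counts[c] += 1
        frequent = {s: c for s, c in counts.items() if c >= min_support}
        result.update(frequent)
        k += 1
    return result


# Hypothetical market-basket data, for illustration only.
baskets = [
    ["bread", "milk"],
    ["bread", "diapers", "beer"],
    ["milk", "diapers", "beer"],
    ["bread", "milk", "diapers"],
    ["bread", "milk", "beer"],
]
frequent_sets = apriori(baskets, min_support=3)
```

In a distributed setting the counting step is the part that would be spread across machines. How Adaptive-Miner plans and executes each iteration is described in the paper and the linked repository, not in this sketch.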

Bibliographic details

  • Authors

    Sanjay Rathee; Arti Kashyap

  • Year: 2018
  • Original format: PDF
  • Language: English
