
Adaptive-Miner: an efficient distributed association rule mining algorithm on Spark



Abstract

Extracting valuable data from large datasets is one of the most important research problems, and association rule mining is one of the most widely used methods for this purpose. Finding possible associations between items in large transaction-based datasets (finding frequent itemsets) is the most crucial part of the association rule mining task. Many single-machine association rule mining algorithms exist, but the massive amount of data available today exceeds the capacity of a single-machine algorithm. Therefore, to meet the demands of this ever-growing data, there is a need for a distributed association rule mining algorithm that can run on multiple machines. For these types of parallel/distributed applications, MapReduce is one of the best fault-tolerant frameworks. Hadoop is one of the most popular open-source software frameworks with a MapReduce-based approach for distributed storage and processing of large datasets using standalone clusters built from commodity hardware. However, heavy disk I/O at each iteration of a highly iterative algorithm like Apriori makes Hadoop inefficient. A number of MapReduce-based platforms have been developed for parallel computing in recent years. Among them, Spark has attracted a lot of attention because of its built-in support for distributed computations. Therefore, we implemented a distributed association rule mining algorithm on Spark, named Adaptive-Miner, which uses an adaptive approach to find frequent patterns with higher accuracy and efficiency. Adaptive-Miner uses an adaptive strategy based on the partial processing of datasets. It makes execution plans before every iteration and goes with the most suitable plan to minimize time and space complexity. Adaptive-Miner is a dynamic association rule mining algorithm that changes its approach based on the nature of the dataset. Therefore, it is different from and better than state-of-the-art static association rule mining algorithms. We conduct in-depth experiments to gain insight into the effectiveness, efficiency, and scalability of the Adaptive-Miner algorithm on Spark. Available: https://github.com/sanjaysinghrathi/Adaptive-Miner
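
For readers unfamiliar with the frequent-itemset step mentioned in the abstract, the following is a minimal single-machine sketch of the classic level-wise (Apriori-style) search. It is not the Adaptive-Miner algorithm and does not use Spark; the function name `frequent_itemsets`, its parameters, and the sample transactions are illustrative assumptions. It only shows why the search is iterative: each level k needs a full pass over the transactions, which is the per-iteration cost the abstract refers to.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Return {itemset: support count} for all itemsets seen in at least
    min_support transactions (hypothetical helper, Apriori-style)."""
    transactions = [frozenset(t) for t in transactions]

    # Level 1: count individual items with one pass over the data.
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    current = {s: c for s, c in counts.items() if c >= min_support}
    result = dict(current)

    k = 2
    while current:
        # Candidate generation: join frequent (k-1)-itemsets and keep only
        # size-k unions whose (k-1)-subsets are all frequent (pruning).
        candidates = set()
        for a, b in combinations(list(current), 2):
            union = a | b
            if len(union) == k and all(
                frozenset(sub) in current for sub in combinations(union, k - 1)
            ):
                candidates.add(union)

        # Counting: one more full pass over the transactions per level.
        counts = {c: 0 for c in candidates}
        for t in transactions:
            for c in candidates:
                if c <= t:
                    counts[c] += 1

        current = {s: c for s, c in counts.items() if c >= min_support}
        result.update(current)
        k += 1

    return result


if __name__ == "__main__":
    # Made-up sample data, for illustration only.
    baskets = [
        {"bread", "milk"},
        {"bread", "butter"},
        {"bread", "milk", "butter"},
        {"milk"},
    ]
    for itemset, support in frequent_itemsets(baskets, min_support=2).items():
        print(sorted(itemset), support)
```

In a distributed setting the counting pass would typically be spread across partitions of the transaction data; how Adaptive-Miner plans each iteration is described in the paper itself and is not reproduced here.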


Bibliographic details

Authors: Sanjay Rathee; Arti Kashyap
Year: 2018
Original format: PDF
Language: English

Similar literature

1. Adaptive-Miner: an efficient distributed association rule mining algorithm on Spark [J]. Sanjay Rathee, Arti Kashyap. Journal of Big Data. 2018, Issue 1.
2. An Efficient Approach of Association Rule Mining on Distributed Database Algorithm [J]. Baocang Wang. Journal of Computational Intelligence in Bioinformatics. 2019, Issue 1.
3. An Optimized Distributed Association Rule Mining Algorithm in Parallel and Distributed Data Mining with XML Data for Improved Response Time [J]. Sujni Paul. International Journal of Computer Science & Information Technology (IJCSIT). 2010, Issue 2.
4. A Novel Efficient Mining Association Rules Algorithm for Distributed Databases [C]. Liangzhong Shen. International Conference on Advanced Measurement and Test. 2010.
5. Efficient sequential and parallel algorithms for mining association rules in text databases [D]. Holt, John D. 2003.
6. TSARM-UDP: An Efficient Time Series Association Rules Mining Algorithm Based on Up-to-Date Patterns [O]. Qiang Zhao, Qing Li, Deshui Yu. 2021.
7. Enhancing association rules algorithms for mining distributed databases. Integration of fast BitTable and multi-agent association rules mining in distributed medical databases for decision support. [O]. Abdo Walid Adly Atteya. 2012.
