Accelerating Frequent Itemsets Mining on the Cloud: A MapReduce-Based Approach

Abstract

Frequent pattern mining plays a critical role in mining associations, sequential patterns, correlations, causality, episodes, multidimensional patterns, emerging patterns, and many other significant data mining tasks. With the exponential growth of available data, most traditional frequent pattern mining algorithms become ineffective because of either huge resource requirements or large communication overhead. Cloud computing has shown that very large datasets can be processed on commodity clusters when the right programming model is provided. MapReduce, a parallel programming model and one of the most important techniques for cloud computing, has emerged for mining datasets of terabyte scale or larger on clusters of computers. Converting a serial mining algorithm into a distributed algorithm on the MapReduce framework is not necessarily difficult, but the mining performance can be unsatisfactory. In this paper, we propose a method that finds all frequent itemsets using just two MapReduce phases in a time- and communication-efficient manner. We present experimental results that corroborate our theoretical claims.
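
The abstract does not say what each of the two MapReduce phases computes. As an illustration only, the sketch below assumes the well-known partition-based scheme (often called SON): phase 1 mines each data partition locally with a proportionally scaled threshold and unions the results into a candidate set, and phase 2 counts those candidates over all partitions and keeps the globally frequent ones. This is not necessarily the authors' method. The function names, the in-memory lists standing in for cluster partitions, and the Apriori-style local miner are all hypothetical choices made for this example.

```python
from collections import Counter
from itertools import combinations
from math import ceil


def local_frequent_itemsets(partition, min_count):
    """Apriori-style level-wise mining of one in-memory partition (assumed local miner)."""
    transactions = [frozenset(t) for t in partition]
    # Level 1: count single items.
    counts = Counter(frozenset([item]) for t in transactions for item in t)
    current = {s for s, c in counts.items() if c >= min_count}
    frequent = set(current)
    k = 2
    while current:
        # Join: unions of frequent (k-1)-itemsets that contain exactly k items.
        candidates = {a | b for a in current for b in current if len(a | b) == k}
        # Prune: every (k-1)-subset of a candidate must itself be frequent.
        candidates = {c for c in candidates
                      if all(frozenset(s) in current for s in combinations(c, k - 1))}
        counts = Counter(c for t in transactions for c in candidates if c <= t)
        current = {s for s, c in counts.items() if c >= min_count}
        frequent |= current
        k += 1
    return frequent


def phase1_map(partition, min_support):
    # Scale the global relative threshold to this partition's size.
    return local_frequent_itemsets(partition, ceil(min_support * len(partition)))


def phase1_reduce(local_results):
    # The candidate set is the union of all locally frequent itemsets.
    return set().union(*local_results)


def phase2_map(partition, candidates):
    # Count how often each candidate occurs in this partition.
    transactions = [frozenset(t) for t in partition]
    return Counter(c for t in transactions for c in candidates if c <= t)


def phase2_reduce(partial_counts, min_count):
    # Sum the per-partition counts and keep globally frequent itemsets.
    total = Counter()
    for counts in partial_counts:
        total.update(counts)
    return {s: c for s, c in total.items() if c >= min_count}


def mine_frequent_itemsets(partitions, min_support):
    n = sum(len(p) for p in partitions)
    candidates = phase1_reduce(phase1_map(p, min_support) for p in partitions)
    partial = [phase2_map(p, candidates) for p in partitions]
    return phase2_reduce(partial, ceil(min_support * n))


if __name__ == "__main__":
    partitions = [
        [["a", "b", "c"], ["a", "b"]],
        [["a", "c"], ["b", "c"]],
    ]
    for itemset, count in sorted(mine_frequent_itemsets(partitions, 0.5).items(),
                                 key=lambda kv: (len(kv[0]), sorted(kv[0]))):
        print(sorted(itemset), count)
```

Under this assumed scheme no frequent itemset is missed: an itemset that falls below the scaled threshold in every partition cannot reach the global threshold. The cost is that phase 1 may emit candidates that phase 2 then discards, which is one place where communication overhead can grow.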


Bibliographic details

Source
IEEE International Conference on Data Mining Workshops, 2013, pp. 592-598 (7 pages)
Authors
Zahra Farzanyar; Nick Cercone
Original format
PDF
Keywords
Big Data Mining; Cloud Computing; Frequent Itemset Mining; MapReduce


Similar literature

1. Paradigm and performance analysis of distributed frequent itemset mining algorithms based on Mapreduce [J]. Xiao Wen, Hu Juan. Microprocessors and microsystems. 2021, issue Apra
2. An Efficient Algorithm of Frequent Itemsets Mining Based on MapReduce [J]. Le Wang, Lin Feng, Jing Zhang. Journal of information and computational science. 2014, issue 8
3. The MapReduce Model on Cascading Platform for Frequent Itemset Mining [J]. Nur Rokhman, Amelia Nursanti. Indonesian Journal of Computing and Cybernetics Systems. 2018, issue 2
4. Accelerating Frequent Itemsets Mining on the Cloud: A MapReduce-Based Approach [C]. Zahra Farzanyar, Nick Cercone. IEEE International Conference on Data Mining Workshops. 2013
5. Constraint-based frequent itemset mining from data streams [D]. Khan, Quamrul Islam. 2006
6. An efficient pattern growth approach for mining fault tolerant frequent itemsets [O]. Shariq Bashir
7. Frequent Itemset Mining Based on Development of FP-growth Algorithm and Use MapReduce Technique [O]. Zakria Mahrousa, Dima Mufti Alchawafa, Hasan Kazzaz. 2021
