PaWI: Parallel Weighted Itemset Mining by Means of MapReduce

Abstract

Frequent itemset mining is an exploratory data mining technique that has been fruitfully exploited to extract recurrent co-occurrences between data items. Since in many application contexts items are enriched with weights denoting their relative importance in the analyzed data, pushing item weights into the itemset mining process, i.e., mining weighted itemsets rather than traditional itemsets, is an appealing research direction. Although many efficient in-memory weighted itemset mining algorithms are available in the literature, there is a lack of parallel and distributed solutions that can scale towards Big Weighted Data. This paper presents a scalable frequent weighted itemset mining algorithm based on the MapReduce paradigm. To demonstrate its actionability and scalability, the proposed algorithm was tested on a real big dataset collecting approximately 34 million reviews of Amazon items. Weights indicate the ratings given by users to the purchased items. The mined itemsets represent combinations of items that were frequently bought together with an overall rating above average.
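
The abstract does not define how item weights are combined, so the following is only a toy sketch of the general idea and not the PaWI algorithm itself. It assumes that an itemset's weight in a transaction is the minimum weight of its items and that its weighted support is the sum of those weights over all transactions. The map and reduce phases are simulated in memory with plain Python, and the data, function names, and threshold are hypothetical.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical toy data: each transaction maps an item to its weight
# (for example, a 1-5 rating given by the user).
transactions = [
    {"A": 5, "B": 4, "C": 2},
    {"A": 4, "B": 5},
    {"A": 1, "C": 3},
    {"B": 4, "C": 5},
]

def map_phase(transaction, max_size=2):
    """Emit (itemset, weight) pairs for every itemset of up to max_size items.

    Assumption: the weight of an itemset in a transaction is the minimum
    weight among its items. Brute-force enumeration is used for clarity only.
    """
    items = sorted(transaction)
    for k in range(1, max_size + 1):
        for itemset in combinations(items, k):
            yield itemset, min(transaction[i] for i in itemset)

def reduce_phase(pairs):
    """Sum the emitted weights per itemset (the weighted support)."""
    totals = defaultdict(int)
    for itemset, weight in pairs:
        totals[itemset] += weight
    return dict(totals)

def mine(transactions, min_wsup, max_size=2):
    """Return the itemsets whose weighted support reaches min_wsup."""
    pairs = (p for t in transactions for p in map_phase(t, max_size))
    totals = reduce_phase(pairs)
    return {s: w for s, w in totals.items() if w >= min_wsup}

frequent = mine(transactions, min_wsup=8)
```

In a real MapReduce job the mapper output would be shuffled by itemset key before reduction, and a scalable miner would avoid enumerating every candidate itemset; the sketch only shows where the weights enter the support count.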

Bibliographic record

Source
IEEE International Congress on Big Data | 2015 | 8 pages
Authors
Baralis Elena; Cagliero Luca; Garza Paolo; Grimaudo Luigi
Original format: PDF
Chinese Library Classification: Programming, software engineering
Keywords
Data mining; H.2.8.b Clustering, classification, and association rules

Similar literature

1. A Parallel MapReduce Algorithm to Efficiently Support Itemset Mining on High Dimensional Data [J]. Daniele Apiletti, Elena Baralis, Tania Cerquitelli. Big Data Research. 2017
2. FiDoop: Parallel Mining of Frequent Itemsets Using MapReduce [J]. Xun Yaling, Zhang Jifu, Qin Xiao. IEEE Transactions on Systems, Man, and Cybernetics. 2016, No. 3
3. Weighted Support Association Rule Mining using Closed Itemset Lattices in Parallel [J]. A.M.J. Md. Zubair Rahman, P. Balasubramanie. International Journal of Computer Science and Network Security. 2009, No. 3
4. PaWI: Parallel Weighted Itemset Mining by Means of MapReduce [C]. Baralis Elena, Cagliero Luca, Garza Paolo. 2015 IEEE International Congress on Big Data. 2015
5. Mining Frequent Itemsets from Uncertain Data: Extensions to Constrained Mining and Stream Mining [D]. Hao, Boyu. 2010
6. K-mer clustering algorithm using a MapReduce framework: application to the parallelization of the Inchworm module of Trinity [O]. Chang Sik Kim, Martyn D. Winn, Vipin Sachdeva. 2017
7. PaWI: Parallel Weighted Itemset Mining by means of MapReduce [O]. Baralis, Elena Maria, Cagliero, Luca, Garza, Paolo. 2015
