PaWI: Parallel Weighted Itemset Mining by Means of MapReduce

Abstract

Frequent itemset mining is an exploratory data mining technique that has been fruitfully exploited to extract recurrent co-occurrences among data items. In many application contexts, items are enriched with weights denoting their relative importance in the analyzed data, so pushing item weights into the itemset mining process, i.e., mining weighted itemsets rather than traditional itemsets, is an appealing research direction. Although many efficient in-memory weighted itemset mining algorithms are available in the literature, there is a lack of parallel and distributed solutions that can scale towards Big Weighted Data. This paper presents a scalable frequent weighted itemset mining algorithm based on the MapReduce paradigm. To demonstrate its actionability and scalability, the proposed algorithm was tested on a real Big dataset collecting approximately 34 million reviews of Amazon items. Weights indicate the ratings given by users to the purchased items. The mined itemsets represent combinations of items that were frequently bought together with an overall rating above average.
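
The idea of mining weighted itemsets in a map/reduce style can be illustrated with a small single-machine sketch. It is not the PaWI algorithm itself, which the abstract does not detail. The function names, the toy transactions, and the brute-force enumeration are hypothetical, and the choice of the minimum item weight as an itemset's weight within a transaction is an assumption made only for illustration.

```python
from collections import defaultdict
from itertools import combinations


def map_transaction(transaction, max_size=2):
    """Map phase: emit (itemset, weight) pairs for one weighted transaction.

    `transaction` maps item -> weight (e.g. a user's rating of a product).
    Assumption: an itemset's weight in a transaction is the minimum weight
    of its items; the abstract does not state the aggregation function.
    """
    items = sorted(transaction)
    for size in range(1, max_size + 1):
        for itemset in combinations(items, size):
            yield itemset, min(transaction[i] for i in itemset)


def reduce_weights(pairs):
    """Reduce phase: sum the emitted weights per itemset."""
    totals = defaultdict(float)
    for itemset, weight in pairs:
        totals[itemset] += weight
    return dict(totals)


def mine_weighted_itemsets(transactions, min_wsupport, max_size=2):
    """Keep itemsets whose summed weight reaches the (hypothetical) threshold."""
    pairs = (p for t in transactions for p in map_transaction(t, max_size))
    totals = reduce_weights(pairs)
    return {k: v for k, v in totals.items() if v >= min_wsupport}


# Toy data: each transaction is one user's purchases with their ratings.
transactions = [
    {"book": 5, "lamp": 4},
    {"book": 4, "lamp": 2, "pen": 5},
    {"book": 3, "pen": 1},
]
result = mine_weighted_itemsets(transactions, min_wsupport=6)
print(result)
```

On a real cluster the map and reduce steps would run on separate partitions of the data, and a practical algorithm would prune candidates rather than enumerate every combination.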

Bibliographic record

Source
2015 IEEE International Congress on Big Data | 2015 | pp. 25-32 | 8 pages
Conference location: New York, NY (US)
Authors
Baralis Elena; Cagliero Luca; Garza Paolo; Grimaudo Luigi;
Author affiliation

Dipt. di Autom. e Inf., Politec. di Torino, Turin, Italy;

Conference organizer
Original format: PDF
Language of text: eng
Chinese Library Classification
Keywords
Data mining; H.2.8.b Clustering, classification, and association rules

Similar literature

1. A Parallel MapReduce Algorithm to Efficiently Support Itemset Mining on High Dimensional Data [J]. Daniele Apiletti, Elena Baralis, Tania Cerquitelli. Big Data Research, 2017.
2. FiDoop: Parallel Mining of Frequent Itemsets Using MapReduce [J]. Xun Yaling, Zhang Jifu, Qin Xiao. IEEE Transactions on Systems, Man, and Cybernetics, 2016, no. 3.
3. Weighted Support Association Rule Mining using Closed Itemset Lattices in Parallel [J]. A.M.J. Md. Zubair Rahman, P. Balasubramanie. International Journal of Computer Science and Network Security, 2009, no. 3.
4. PaWI: Parallel Weighted Itemset Mining by Means of MapReduce [C]. Baralis Elena, Cagliero Luca, Garza Paolo. IEEE International Congress on Big Data, 2015.
5. Mining Frequent Itemsets from Uncertain Data: Extensions to Constrained Mining and Stream Mining [D]. Hao, Boyu. 2010.
6. K-mer clustering algorithm using a MapReduce framework: application to the parallelization of the Inchworm module of Trinity [O]. Chang Sik Kim, Martyn D. Winn, Vipin Sachdeva. 2017.
7. PaWI: Parallel Weighted Itemset Mining by means of MapReduce [O]. Baralis, Elena Maria, Cagliero, Luca, Garza, Paolo. 2015.
