ANG: a combination of Apriori and graph computing techniques for frequent itemsets mining

Zhang Rui; Chen Wenguang; Hsu Tse-Chuan; Yang Hongji; Chung Yeh-Ching

Abstract

The Apriori algorithm is one of the best-known and most widely accepted methods for association rule mining. Apriori uses a prefix tree to represent k-itemsets, generates k-itemset candidates from the frequent (k-1)-itemsets, and determines the frequent k-itemsets by iteratively traversing the prefix tree against the transaction records. When k is small, Apriori executes very efficiently. However, it can become very slow as k grows, because determining the frequent k-itemsets requires deeper recursion. From the perspective of graph computing, the transaction records can be converted to a graph G(V,E), where V is the set of vertices of G, representing the transaction records, and E is the set of edges of G, representing the relations among transaction records. Each k-itemset in the transaction records has a corresponding connected component in G, and the number of vertices in that component is the support of the k-itemset. Since the time to find the connected component corresponding to a k-itemset in G is constant for any k, the graph computing method is very efficient when the number of k-itemsets is relatively small. Based on Apriori and graph computing techniques, a hybrid method called Apriori and Graph Computing (ANG) is proposed to compute the frequent itemsets. ANG initially uses Apriori to compute the frequent k-itemsets and then switches to the graph computing method when k becomes large (that is, when the number of k-itemset candidates is relatively small). The experimental results show that ANG outperforms both Apriori and the graph computing method in all test cases.
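The level-wise Apriori step described in the abstract (build k-itemset candidates from the frequent (k-1)-itemsets, then count their support against the transaction records) can be illustrated with a minimal Python sketch. This is an assumption-laden illustration, not the authors' implementation: it uses plain sets instead of a prefix tree, the function name `apriori`, the `min_support` parameter, and the sample transactions are all hypothetical, and it does not attempt the graph computing phase, whose construction the abstract does not specify.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {itemset (frozenset): support count} for all frequent itemsets.

    Hypothetical helper for illustration; uses sets, not a prefix tree.
    """
    transactions = [frozenset(t) for t in transactions]

    # Level 1: count how many transactions contain each single item.
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    frequent = {s: c for s, c in counts.items() if c >= min_support}
    result = dict(frequent)

    k = 2
    while frequent:
        prev = set(frequent)
        # Join step: unions of frequent (k-1)-itemsets that have exactly k items.
        candidates = {a | b for a in prev for b in prev if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must itself be frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(s) in prev for s in combinations(c, k - 1))
        }
        # Count step: one pass over the transactions for this level.
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        frequent = {s: n for s, n in counts.items() if n >= min_support}
        result.update(frequent)
        k += 1
    return result


# Made-up sample data, not taken from the paper.
transactions = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"butter", "milk"},
]
frequent_itemsets = apriori(transactions, min_support=2)
```

With this sample, every single item and every pair is frequent, while the one 3-itemset appears in a single transaction and is dropped, so the loop stops at k = 3. Each level costs a full pass over the transactions, which is the per-level overhead the abstract's hybrid approach aims to avoid at large k.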

Bibliographic information

Source
Journal of Supercomputing | 2019, Issue 2 | pp. 646-661 | 16 pages
Authors
Zhang Rui; Chen Wenguang; Hsu Tse-Chuan; Yang Hongji; Chung Yeh-Ching
Author affiliations

Tsinghua Univ, Grad Sch Shenzhen, Shenzhen 518057, Peoples R China;

Tsinghua Univ, Dept Comp Sci & Technol, Beijing 100084, Peoples R China;

BathSpa Univ, Ctr Creat Comp, Bath, Avon, England;

BathSpa Univ, Ctr Creat Comp, Bath, Avon, England;

Tsinghua Univ Shenzhen, Res Inst, Shenzhen 518057, Peoples R China;

Indexed in: Science Citation Index (SCI, USA); Engineering Index (EI, USA)
Original format: PDF
Language: English
Keywords
Apriori; Graph computing; Frequent itemset mining; Data mining

Similar literature

1. Scalable Frequent Itemset Mining Using Heterogeneous Computing: parApriori Algorithm [J]. V. B. Nikam, B. B. Meshram. International Journal of Distributed and Parallel Systems. 2014, Issue 5
2. EEAAFIT: Enhanced Efficiency of Apriori Approach for Frequent Itemset Techniques [J]. Ramalingam Sugumara, M. Appas Ali. International Journal of Applied Engineering Research. 2018, Issue 11aPta5
3. EAFIM: efficient apriori-based frequent itemset mining algorithm on Spark for big transactional data [J]. Knowledge and information systems. 2020, Issue 9
4. Apriori, Association Rules, Data Mining, Frequent Itemsets Mining (FIM), Parallel Computing [C]. Yoshikawa M., Terai H. Software Engineering Research, Management and Applications, 2006. Fourth International Conference on. 2006
5. Mining Frequent Itemsets Using Improved Apriori on Spark [D]. Khandelwal, Ashutosh. 2017
6. Unravelling associations between unassigned mass spectrometry peaks with frequent itemset mining techniques [O]. Trung Nghia Vu, Aida Mrzic, Dirk Valkenborg, 2014
7. ANG - a combination of Apriori and graph computing techniques for frequent itemsets mining [O]. Zhang, R, Chen, W, Hsu, T, 2017
