Smart frequent itemsets mining algorithm based on FP-tree and DIFFset data structures

GEORGE GATUHA; TAO JIANG

Abstract

Association rule mining is a key technique for discovering meaningful relationships in large datasets. Several frequent itemset mining techniques build on the FP-tree, a prefix-tree structure that represents the database in compressed form. The DIFFset data structure has also been shown to significantly reduce the run time and memory utilization of some data mining algorithms. Experimental results have demonstrated the efficiency of both data structures in frequent itemset mining. This work proposes FDM, a new algorithm based on the FP-tree and DIFFset data structures for efficiently discovering frequent patterns in data. FDM adapts its behavior to mine both long and short patterns efficiently from dense as well as sparse datasets. Several optimization techniques are also outlined to increase the efficiency of FDM. FDM was evaluated against three frequent itemset mining algorithms, dEclat, FP-growth, and FDM* (FDM without optimization), on datasets containing both long and short frequent patterns. The experimental results show that FDM performs significantly better than the FP-growth, dEclat, and FDM* algorithms.
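
To make the DIFFset idea concrete, the following is a minimal sketch of diffset-based support counting in the style of dEclat, one of the baselines named above. It is not the FDM algorithm, whose details the abstract does not give. The function names `tidsets` and `mine_with_diffsets`, the toy transactions, and the absolute `min_support` threshold are all assumptions made for illustration. The sketch relies on the standard diffset identities: for two itemsets PX and PY sharing a prefix P, d(PXY) = d(PY) \ d(PX) and support(PXY) = support(PX) - |d(PXY)|.

```python
from collections import defaultdict


def tidsets(transactions):
    """Map each item to the set of transaction ids (tids) that contain it."""
    result = defaultdict(set)
    for tid, items in enumerate(transactions):
        for item in items:
            result[item].add(tid)
    return dict(result)


def mine_with_diffsets(transactions, min_support):
    """Return {itemset tuple: support} for all frequent itemsets.

    Hypothetical helper for illustration; min_support is an absolute count.
    """
    all_tids = set(range(len(transactions)))

    # Level 1: the diffset of a single item relative to the empty prefix
    # is the set of tids that do NOT contain the item.
    level = []
    for item, tids in sorted(tidsets(transactions).items()):
        support = len(tids)
        if support >= min_support:
            level.append(((item,), all_tids - tids, support))

    frequent = {}

    def extend(siblings):
        # All entries in `siblings` share the same prefix P.
        for i, (itemset_a, diff_a, sup_a) in enumerate(siblings):
            frequent[itemset_a] = sup_a
            children = []
            for itemset_b, diff_b, _ in siblings[i + 1:]:
                diff_ab = diff_b - diff_a        # d(PXY) = d(PY) \ d(PX)
                sup_ab = sup_a - len(diff_ab)    # support(PXY)
                if sup_ab >= min_support:
                    children.append(
                        (itemset_a + (itemset_b[-1],), diff_ab, sup_ab)
                    )
            extend(children)

    extend(level)
    return frequent


if __name__ == "__main__":
    toy = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b", "c"}, {"a", "b", "c"}]
    for itemset, support in sorted(mine_with_diffsets(toy, 3).items()):
        print(itemset, support)
```

On dense data most items occur in most transactions, so the diffsets stay small compared with full tid lists, which is where the reported savings in run time and memory come from.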


Bibliographic information

Source: Turkish Journal of Electrical Engineering and Computer Sciences, 2017, Issue 3, 12 pages
Authors: GEORGE GATUHA; TAO JIANG
Original format: PDF
CLC classification: Industrial economics
Similar literature

1. Parallelization of Frequent Itemset Mining Methods with FP-tree: An Experiment with PrePost+ Algorithm [J]. Jamsheela Olakara, Gopalakrishna Raju. The International Arab Journal of Information Technology. 2021, Issue 2.
2. Efficient Subset-Lattice Algorithms for Mining Closed Frequent Itemsets and Maximal Frequent Itemsets in Data Streams [J]. Ye-In Chang, Chia-En Li, Wei-Hau Peng. International Journal of Electrical Engineering: Transactions of the Chinese Institute of Engineers, Series E. 2013, Issue 2.
3. A Frequent Pattern Mining Algorithm Based on FP-Tree Structure and Apriori Algorithm [J]. M Suman, T Anuradha, K Gowtham. International Journal of Engineering Research and Applications. 2012, Issue 1.
4. Frequent Itemsets Mining Algorithm based On Differential Privacy and FP-Tree [C]. Ding Zhe, Chunwang Wu, Zhao Jun. International Computer Conference on Wavelet Active Media Technology and Information Processing. 2020.
5. New algorithms for frequent sequential pattern and itemset data mining in certain and uncertain databases [D]. Peterson, Erich Allen. 2012.
6. Bit-Table Based Biclustering and Frequent Closed Itemset Mining in High-Dimensional Binary Data [O]. András Király, Attila Gyenesei, János Abonyi.
7. Smart frequent itemsets mining algorithm based on FP-tree and DIFFset data structures [O]. George GATUHA, Tao JIANG. 2017.
