TKFIM: Top-K frequent itemset mining technique based on equivalence classes

Saood Iqbal; Abdul Shahid; Muhammad Roman; Zahid Khan; Shaha Al-Otaibi; Lisu Yu

Abstract

Frequent itemset (FI) mining is a significant topic in data mining research. Over the last ten years, owing to innovative development, the quantity of data has grown exponentially, which imposes new challenges for FI mining applications. Recent algorithms, both threshold-based and size-based, may produce misleading information. The threshold value plays a central role in generating frequent itemsets from a given dataset, yet selecting a support threshold is very difficult for users who are unaware of the dataset’s characteristics. Algorithms that find FIs without a support threshold, however, perform poorly because of heavy computation. We therefore propose a method for discovering FIs without a support threshold, called Top-k frequent itemset mining (TKFIM). It uses equivalence classes and set-theory concepts to mine FIs. The proposed procedure does not miss any FIs, so the mined frequent patterns are accurate. The results are compared with state-of-the-art techniques, namely Top-k miner and Build Once and Mine Once (BOMO). TKFIM outperforms both approaches in terms of execution and performance, achieving gains of 92.70, 35.87, 28.53, and 81.27 percent over Top-k miner on the Chess, Mushroom, Connect, and T10I4D100K datasets, respectively. Similarly, it achieves gains of 97.14, 100, 78.10, and 99.70 percent over BOMO on the Chess, Mushroom, Connect, and T10I4D100K datasets, respectively. We therefore argue that the proposed procedure may be adopted on large datasets for better performance.
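
The abstract does not spell out the TKFIM procedure itself, so the following is only a minimal sketch of the general idea it describes: mining the k most frequent itemsets without a user-supplied support threshold, using prefix-based equivalence classes and set intersections. It follows the well-known Eclat-style vertical layout rather than the authors' algorithm. The function name `top_k_itemsets`, the tie-breaking rule, and the toy transactions are all assumptions made for illustration.

```python
import heapq
from itertools import count


def top_k_itemsets(transactions, k):
    """Illustrative sketch (not the paper's TKFIM): return the k itemsets
    with the highest support as (itemset, support) pairs."""
    # Vertical layout: map each item to the set of transaction ids holding it.
    tidsets = {}
    for tid, transaction in enumerate(transactions):
        for item in transaction:
            tidsets.setdefault(item, set()).add(tid)

    heap = []           # min-heap of (support, tiebreak, itemset), size <= k
    tiebreak = count()  # unique counter so itemsets are never compared

    def min_support():
        # No user threshold: start at 1 and raise it once k itemsets are held.
        return heap[0][0] if len(heap) == k else 1

    def offer(itemset, support):
        if len(heap) < k:
            heapq.heappush(heap, (support, next(tiebreak), itemset))
        elif support > heap[0][0]:
            # Ties with the current k-th itemset are dropped (an assumption).
            heapq.heapreplace(heap, (support, next(tiebreak), itemset))

    def extend(prefix_class):
        # prefix_class: (itemset, tidset) pairs that share a common prefix.
        for i, (itemset, tids) in enumerate(prefix_class):
            if len(tids) < min_support():
                continue  # no superset of this itemset can reach the top k
            next_class = []
            for other, other_tids in prefix_class[i + 1:]:
                joined = tids & other_tids  # support via set intersection
                if len(joined) >= min_support():
                    candidate = itemset + (other[-1],)
                    offer(candidate, len(joined))
                    next_class.append((candidate, joined))
            if next_class:
                extend(next_class)

    # Process frequent items first so the internal threshold rises quickly.
    singles = sorted(tidsets.items(), key=lambda kv: (-len(kv[1]), kv[0]))
    root = [((item,), tids) for item, tids in singles]
    for itemset, tids in root:
        offer(itemset, len(tids))
    extend(root)

    return sorted(((s, n) for n, _, s in heap), key=lambda p: (-p[1], p[0]))


# Hypothetical toy data, for illustration only.
transactions = [{"a", "b"}, {"a", "b"}, {"a", "b"}, {"a", "c"}, {"c"}]
result = top_k_itemsets(transactions, 3)
print(result)
```

The only user input is k. The support threshold is derived internally from the k-th best itemset found so far, and because support can never increase when an itemset grows, any candidate below that threshold can be pruned together with all of its supersets.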

Bibliographic details

Source: PeerJ Computer Science, 2021, issue a, 27 pages
Authors: Saood Iqbal; Abdul Shahid; Muhammad Roman; Zahid Khan; Shaha Al-Otaibi; Lisu Yu
Original format: PDF
Chinese Library Classification: Computing technology, computer technology
