An efficient structure for fast mining high utility itemsets

Deng Zhi-Hong




Abstract

High utility itemset mining has emerged as an important research issue in data mining because it has a wide range of real-life applications. Although a number of algorithms have been proposed in recent years, mining efficiency remains a major challenge: these algorithms suffer either from inefficient calculation of candidates' utilities or from the generation of a huge number of candidates. In this paper, we propose a novel data structure named PUN-list (PU-tree-Node list), which maintains both the utility information about an itemset and a utility upper bound to facilitate the mining of high utility itemsets. Based on PUN-lists, we present a method named MIP (Mining high utility Itemsets using PUN-lists) for efficiently mining high utility itemsets. The efficiency of MIP is achieved with three techniques. First, itemsets are represented by a highly condensed data structure, the PUN-list, which avoids costly and repeated utility computation. Second, the utility of an itemset can be efficiently calculated by scanning its PUN-list, and the PUN-lists of long itemsets can be efficiently constructed from the PUN-lists of short itemsets. Third, by employing the utility upper bound stored in the PUN-lists as the pruning strategy, MIP directly discovers high utility itemsets from the search space, a set-enumeration tree, without generating numerous candidates. Extensive experiments on various synthetic and real datasets show that MIP is very efficient: it is much faster than HUI-Miner, d2HUP, and UP-Growth+, especially on dense datasets.
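To make the terms in the abstract concrete, the following is a minimal Python sketch of high utility itemset mining over a set-enumeration tree with upper-bound pruning. It is not the PUN-list structure or the MIP algorithm, whose details the abstract does not give. It assumes the standard definitions (utility = quantity × unit profit, summed over the transactions that contain the whole itemset) and swaps in the classic transaction-weighted utilization (TWU) as the upper bound. The toy database, the profit table, and all function names are hypothetical.

```python
# Toy database (hypothetical): each transaction maps item -> purchased quantity.
transactions = [
    {"a": 1, "b": 2, "c": 1},
    {"a": 2, "c": 3},
    {"b": 1, "c": 1, "d": 2},
    {"a": 1, "b": 1, "d": 1},
]
# Hypothetical unit profit of each item.
profit = {"a": 5, "b": 2, "c": 1, "d": 4}


def utility(itemset):
    """Sum of quantity * profit over transactions containing every item."""
    total = 0
    for tx in transactions:
        if all(item in tx for item in itemset):
            total += sum(tx[item] * profit[item] for item in itemset)
    return total


def twu(itemset):
    """Transaction-weighted utilization: total utility of the transactions
    containing the itemset. It bounds the utility of the itemset and of
    every superset from above (assuming non-negative profits)."""
    total = 0
    for tx in transactions:
        if all(item in tx for item in itemset):
            total += sum(qty * profit[item] for item, qty in tx.items())
    return total


def mine(min_util):
    """Depth-first walk of the set-enumeration tree with TWU pruning."""
    items = sorted(profit)
    result = {}

    def expand(prefix, start):
        for k in range(start, len(items)):
            candidate = prefix + (items[k],)
            if twu(candidate) < min_util:
                continue  # no superset of candidate can reach min_util
            u = utility(candidate)
            if u >= min_util:
                result[candidate] = u
            expand(candidate, k + 1)

    expand((), 0)
    return result


print(mine(15))
```

With a threshold of 15, the itemset ("a", "b") qualifies with utility 16 even though ("b",) alone has utility 8. Utility is not anti-monotone, which is why a separate upper bound is needed for pruning. This sketch rescans the database for every candidate, which is the repeated utility computation that list-based structures are designed to avoid.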


Bibliographic information

Source
Applied Intelligence: The International Journal of Artificial Intelligence, Neural Networks, and Complex Problem-Solving Technologies | 2018, Issue 9 | 17 pages
Author
Deng Zhi-Hong
Author affiliation

Peking Univ, Sch Elect Engn & Comp Sci, Key Lab Machine Percept, Minist Educ, Beijing 100871, Peoples R China

Original format: PDF
Language: English
Chinese Library Classification: Automation technology, computer technology
Keywords
Data structure; Data mining; High utility itemset; PUN-list; Utility mining


Similar literature

1. A fast high average-utility itemset mining with efficient tighter upper bounds and novel list structure [J]. Sethi Krishan Kumar, Ramesh Dharavath. Journal of Supercomputing. 2020, Issue 12
2. EFIM: a fast and memory efficient algorithm for high-utility itemset mining [J]. Zida Souleymane, Fournier-Viger Philippe, Lin Jerry Chun-Wei. Knowledge and Information Systems. 2017, Issue 2
3. Fast and memory efficient mining of high-utility itemsets from data streams: with and without negative item profits [J]. Hua-Fu Li, Hsin-Yun Huang, Suh-Yin Lee. Knowledge and Information Systems. 2011, Issue 3
4. Fast and Memory Efficient Mining of High Utility Itemsets in Data Streams [C]. Hua-Fu Li, Hsin-Yun Huang, Yi-Cheng Chen. International Conference on Data Mining. 2008
5. Efficiently mining frequent itemsets from very large databases [D]. Zhu, Jianfei. 2004
6. HUIL-TN & HUI-TN: Mining high utility itemsets based on pattern-growth [O]. Le Wang, Shui Wang. 2021
7. DiffNodesets: An Efficient Structure for Fast Mining Frequent Itemsets [O]. Deng, Zhi-Hong. 2015
