Journal: Expert Systems with Applications

negFIN: An efficient algorithm for fast mining frequent itemsets



Abstract

Frequent itemset mining is a basic data mining task with numerous applications in other data mining tasks. In recent years, several data structures based on sets of nodes in a prefix tree have been presented; these structures store essential information about frequent itemsets. In this paper, we propose another efficient data structure of this kind, NegNodeset. Like its predecessors, NegNodeset is built on sets of nodes in a prefix tree, but it employs a novel encoding model for those nodes based on the bitmap representation of sets. Based on the NegNodeset data structure, we propose negFIN, an efficient algorithm for frequent itemset mining. The efficiency of negFIN rests on three properties: (1) the NegNodesets of itemsets are extracted using bitwise operators, (2) the complexity of calculating NegNodesets and counting supports is reduced to O(n), where n is the cardinality of the NegNodeset, and (3) it employs a set-enumeration tree to generate frequent itemsets and uses a promotion method to prune the search space in this tree. Our extensive performance study on a variety of benchmark datasets indicates that negFIN is the fastest of the compared state-of-the-art algorithms, although it runs at the same speed as dFIN on some datasets. © 2018 Elsevier Ltd. All rights reserved.
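
To make two of the ideas in the abstract concrete (bitwise operations on bitmap-encoded sets, and depth-first search over a set-enumeration tree), here is a minimal Python sketch. It is not the negFIN algorithm and does not implement NegNodeset, the prefix tree, or the promotion method, none of which the abstract specifies in detail. As a stand-in it uses a plain vertical bitmap per item, where bit t is set when transaction t contains the item. The function name `mine_frequent_itemsets` and its parameters are hypothetical, chosen only for illustration.

```python
from typing import Dict, FrozenSet, Hashable, Iterable, List, Tuple


def mine_frequent_itemsets(
    transactions: List[Iterable[Hashable]], min_support: int
) -> Dict[FrozenSet[Hashable], int]:
    """Return every itemset whose support count is at least min_support.

    Illustrative stand-in only: each item maps to an integer bitmap of the
    transactions that contain it (NOT the paper's NegNodeset encoding).
    """
    # Build one bitmap per item: bit `tid` is 1 if transaction `tid` has the item.
    bitmaps: Dict[Hashable, int] = {}
    for tid, transaction in enumerate(transactions):
        for item in set(transaction):
            bitmaps[item] = bitmaps.get(item, 0) | (1 << tid)

    def support(bits: int) -> int:
        # Support count = number of set bits in the bitmap.
        return bin(bits).count("1")

    # Only frequent single items can appear in a frequent itemset.
    frequent_items = sorted(
        item for item, bits in bitmaps.items() if support(bits) >= min_support
    )
    result: Dict[FrozenSet[Hashable], int] = {}

    def expand(prefix: Tuple[Hashable, ...], prefix_bits: int, candidates: list) -> None:
        # One node of the set-enumeration tree: extend `prefix` with each
        # later item, so every itemset is generated exactly once.
        for idx, item in enumerate(candidates):
            bits = prefix_bits & bitmaps[item]  # bitwise AND = set intersection
            count = support(bits)
            if count < min_support:
                continue  # infrequent: no superset can be frequent, prune subtree
            itemset = prefix + (item,)
            result[frozenset(itemset)] = count
            expand(itemset, bits, candidates[idx + 1:])

    all_transactions = (1 << len(transactions)) - 1  # bitmap with every bit set
    expand((), all_transactions, frequent_items)
    return result
```

Calling `mine_frequent_itemsets(transactions, min_support)` with a list of transactions returns a dictionary from each frequent itemset to its support count. The point of the bitmap encoding in this sketch is that one integer AND replaces an explicit set intersection when a candidate itemset is extended.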
