Journal: Knowledge-Based Systems

Index-BitTableFI: An improved algorithm for mining frequent itemsets

Abstract

Efficient algorithms for mining frequent itemsets are crucial for mining association rules and for many other data mining tasks. Several methods for mining frequent itemsets have been implemented using a BitTable structure. BitTableFI is a recently proposed, efficient BitTable-based algorithm that exploits the BitTable both horizontally and vertically. Although it uses efficient bitwise operations, BitTableFI may still suffer from the high cost of candidate generation and testing. To address this problem, a new algorithm, Index-BitTableFI, is proposed. Index-BitTableFI also uses the BitTable horizontally and vertically. To use the BitTable horizontally, an index array and a corresponding computing method are proposed. By computing the subsume index, the itemsets that co-occur with a representative item can be identified quickly in a single breadth-first pass. Then, for the itemsets generated through the index array, a depth-first search strategy is used to generate all other frequent itemsets. The result is a hybrid search that greatly reduces the search space. The proposed method has two advantages. First, redundant tidset intersections and frequency checks are largely avoided. Second, it is proved that the frequent itemsets that include a representative item and have the same support as that item can be identified directly by joining the representative item with all combinations of the items in its subsume index. This lowers the cost of processing such itemsets and improves efficiency. Experimental results show that the proposed algorithm is efficient, especially on dense datasets.
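
The vertical bit-vector view and the same-support property described in the abstract can be illustrated with a small Python sketch. The abstract does not define the subsume index precisely, so the sketch assumes it is the set of other items whose tidsets contain the tidset of the representative item. The function names (build_bittable, subsume_index, same_support_itemsets) and the toy transactions are hypothetical, chosen for illustration; this is not the authors' implementation, and it omits the item ordering and the depth-first phase.

```python
from itertools import combinations


def build_bittable(transactions):
    """Vertical bit vectors: bit t of table[item] is set if transaction t has item."""
    table = {}
    for tid, items in enumerate(transactions):
        for item in items:
            table[item] = table.get(item, 0) | (1 << tid)
    return table


def support(bits):
    """Number of transactions encoded in a bit vector."""
    return bin(bits).count("1")


def itemset_support(table, itemset):
    """Support of an itemset, computed by AND-ing the bit vectors of its items."""
    items = list(itemset)
    bits = table[items[0]]
    for item in items[1:]:
        bits &= table[item]
    return support(bits)


def subsume_index(table, item):
    """ASSUMED definition: other items whose tidset contains the tidset of `item`."""
    bits = table[item]
    return sorted(
        other
        for other, other_bits in table.items()
        if other != item and bits & other_bits == bits
    )


def same_support_itemsets(table, item):
    """Join `item` with every combination of items in its subsume index."""
    sub = subsume_index(table, item)
    return [
        (item,) + combo
        for k in range(len(sub) + 1)
        for combo in combinations(sub, k)
    ]


# Hypothetical toy database of four transactions.
transactions = [
    {"a", "b", "c"},
    {"a", "b"},
    {"b", "c"},
    {"a", "b", "c"},
]
table = build_bittable(transactions)

for itemset in same_support_itemsets(table, "a"):
    # Every itemset produced this way has the same support as item "a".
    print(itemset, itemset_support(table, itemset))
```

Under this assumed definition, every itemset returned by same_support_itemsets has the support of the representative item by construction, so no further tidset intersection or frequency check is needed for it. That is the saving the abstract attributes to the subsume index.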
