Journal of Software

An Algorithm of Top-k High Utility Itemsets Mining over Data Stream

Abstract

Existing top-k high utility itemset (HUI) mining algorithms generate candidate itemsets during mining, so their time and space performance can degrade severely when the dataset is large or contains many long transactions. Performance is especially critical when such algorithms are applied to data streams. To address this issue, this paper proposes TOPK-SW, a sliding-window-based top-k HUI mining algorithm. It first stores each batch of data in the current window, together with the items' utility information, in a tree called HUI-Tree, which allows utility values to be retrieved without rescanning the dataset and thereby improves mining performance. TOPK-SW was tested on four classical datasets. The results show that it significantly outperforms existing algorithms in both time and space efficiency; in particular, its running time improves by more than one order of magnitude.
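
To make the problem setting concrete, the following is a minimal Python sketch of top-k high utility itemset mining over a batch-based sliding window. It is not TOPK-SW and does not build an HUI-Tree: it is a naive baseline that enumerates itemsets directly, which is the kind of cost the abstract says the tree is meant to avoid. The class name `SlidingWindowTopK`, the parameters `window_size`, `k`, and `max_len`, and the representation of a transaction as a dict mapping each item to its utility in that transaction are all assumptions made for illustration.

```python
from collections import deque
from itertools import combinations
import heapq


class SlidingWindowTopK:
    """Naive top-k high utility itemset miner over a sliding window of batches.

    Hypothetical illustration only: brute-force enumeration, no HUI-Tree.
    """

    def __init__(self, window_size, k, max_len=3):
        # deque(maxlen=...) silently drops the oldest batch when a new one arrives.
        self.window = deque(maxlen=window_size)
        self.k = k
        self.max_len = max_len  # cap on itemset length to keep enumeration small

    def add_batch(self, batch):
        """batch: list of transactions, each a dict {item: utility in that transaction}."""
        self.window.append(batch)

    def top_k(self):
        """Return the k itemsets with the highest total utility in the current window."""
        utilities = {}
        for batch in self.window:
            for tx in batch:
                items = sorted(tx)
                for size in range(1, min(self.max_len, len(items)) + 1):
                    for itemset in combinations(items, size):
                        # Utility of an itemset in a transaction is the sum of
                        # the utilities of its items in that transaction.
                        u = sum(tx[i] for i in itemset)
                        utilities[itemset] = utilities.get(itemset, 0) + u
        # Highest utility first; ties are broken by itemset for determinism.
        return heapq.nsmallest(
            self.k, utilities.items(), key=lambda kv: (-kv[1], kv[0])
        )
```

With a window of two batches, adding a third batch evicts the first, so the reported top-k itemsets always reflect only the most recent batches. Every call to `top_k` re-enumerates all transactions in the window, which is why this sketch scales poorly with long transactions.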