Journal: AI Communications

An efficient closed frequent itemset miner for the MOA stream mining system



Abstract

Mining itemsets is a central task in data mining, in both the batch and the streaming paradigms. While robust, efficient, and well-tested implementations exist for batch mining, hardly any publicly available equivalent exists for the streaming scenario. The lack of an efficient, usable tool for the task hinders its use by practitioners and makes it difficult to assess new research in the area. To alleviate this situation, we review the algorithms described in the literature, and implement and evaluate the IncMine algorithm by Cheng, Ke and Ng [J. Intell. Inf. Syst. 31(3) (2008), 191-215] for mining frequent closed itemsets from data streams. Our implementation works on top of the MOA (Massive Online Analysis) stream mining framework to ease its use and integration with other stream mining tasks. We provide a rigorous PAC-style analysis of the quality of the output of IncMine as a function of its parameters; this type of analysis is rare for pattern mining algorithms. As a by-product, the analysis shows how one of the user-provided parameters in the original description can be removed entirely while retaining the performance guarantees. Finally, we experimentally confirm, on both synthetic and real data, the excellent performance of the algorithm, as reported in the original paper, and its ability to handle concept drift.
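
For readers unfamiliar with the term, the following is a minimal sketch of what a frequent closed itemset is, computed by brute force over a small batch of transactions. It is not IncMine and not part of MOA; the function names `frequent_itemsets` and `closed_itemsets`, the toy transactions, and the absolute support threshold are all assumptions made for illustration. An itemset is frequent if it appears in at least `min_support` transactions, and closed if no proper superset has the same support.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Return {itemset: support count} for every itemset that occurs in
    at least min_support transactions (exhaustive enumeration, so this is
    only suitable for tiny examples)."""
    items = sorted(set().union(*transactions))
    result = {}
    for size in range(1, len(items) + 1):
        for candidate in combinations(items, size):
            itemset = frozenset(candidate)
            # Support = number of transactions that contain the itemset.
            count = sum(1 for t in transactions if itemset <= t)
            if count >= min_support:
                result[itemset] = count
    return result


def closed_itemsets(frequent):
    """Keep only itemsets that have no proper superset with equal support.
    A superset with equal support is itself frequent, so it is enough to
    compare against the other frequent itemsets."""
    return {
        itemset: count
        for itemset, count in frequent.items()
        if not any(
            itemset < other and count == other_count
            for other, other_count in frequent.items()
        )
    }


# Hypothetical toy data: four transactions over items a, b, c.
transactions = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"a"}]
frequent = frequent_itemsets(transactions, min_support=2)
closed = closed_itemsets(frequent)
```

In this toy batch, `{b}` is frequent but not closed, because `{a, b}` occurs in exactly the same two transactions. The closed itemsets form a smaller set from which the support of every frequent itemset can still be recovered, which is one reason to mine closed itemsets rather than all frequent ones. A streaming miner has to maintain such a set incrementally as transactions arrive, which this batch sketch does not attempt.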
