Journal: International Journal of Data Science and Analytics
BFSPMiner: an effective and efficient batch-free algorithm for mining sequential patterns over data streams

Abstract

Sequential pattern mining over data streams is an active problem in data stream mining research. Current proposals in the literature are based on the well-known PrefixSpan approach and are able to effectively bound the error of the discovered patterns. These approaches divide the target stream into a collection of manageable chunks, i.e., pieces of the stream, in order to gain effectiveness and efficiency. Unfortunately, mining patterns from stream chunks introduces additional errors compared with the basic application scenario in which the target stream is mined continuously, in a non-batch manner. This is due to several reasons. First, since batches are processed individually, patterns that contain items from two consecutive batches are lost. Second, in most batch-based approaches, the decision about the frequency of a pattern is made locally inside a single batch. Thus, if a pattern is frequent in the stream but its items are scattered over different batches, it is continuously pruned and never becomes frequent, because the algorithm lacks the "complete-picture" perspective. To address these pattern mining problems, this paper introduces and experimentally assesses BFSPMiner, a Batch-Free Sequential Pattern Miner algorithm for effectively and efficiently mining patterns in streams without being constrained to traditional batch-based processing. This allows us, for instance, to discover frequent patterns that would be lost under batch-based stream mining processing models. We complement our analytical contributions with a comprehensive experimental evaluation of BFSPMiner on real-world data streams, in comparison with current batch-based stream sequential pattern mining algorithms.
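The two sources of batch-induced error described in the abstract can be illustrated with a small toy sketch. This is not BFSPMiner and not any algorithm from the paper: it assumes, purely for illustration, that a "pattern" is an adjacent ordered pair of items, and the names `count_pairs`, `batch_counts`, `batch_size` and `min_local_support` are hypothetical.

```python
from collections import Counter


def count_pairs(seq):
    """Count every adjacent ordered pair (a toy 2-item sequential pattern)."""
    return Counter(zip(seq, seq[1:]))


def batch_counts(stream, batch_size, min_local_support):
    """Toy batch-based miner (illustrative assumption, not from the paper).

    Each batch is mined in isolation, and a pattern only contributes to the
    global count if it is frequent inside that single batch.
    """
    total = Counter()
    for start in range(0, len(stream), batch_size):
        batch = stream[start:start + batch_size]
        for pattern, n in count_pairs(batch).items():
            if n >= min_local_support:  # frequency decided locally
                total[pattern] += n
    return total


# With batch_size=3 the stream splits into: xxa | bxa | bxa | bxx
stream = list("xxabxabxabxx")

whole = count_pairs(stream)              # continuous, batch-free counting
no_pruning = batch_counts(stream, 3, 1)  # batches, no local pruning
pruned = batch_counts(stream, 3, 2)      # batches, local support threshold 2

print(whole[("a", "b")], no_pruning[("a", "b")])  # boundary-spanning pattern
print(whole[("x", "a")], pruned[("x", "a")])      # locally infrequent pattern
```

In this sketch the pair `("a", "b")` always straddles a batch boundary, so the batch-based count misses it even without any pruning. The pair `("x", "a")` occurs once in each of three batches, so a local threshold of 2 discards it every time even though it occurs three times in the whole stream.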