Journal: Future Generation Computer Systems

Sliding window based weighted erasable stream pattern mining for stream data applications


Abstract

As a variation of frequent pattern mining, erasable pattern mining discovers patterns whose benefits are lower than or equal to a user-specified threshold in a product database. Although traditional erasable pattern mining algorithms work in static mining environments, they are not suitable for dynamic data stream environments. In a data stream, an algorithm has to process incoming data immediately, with only one database scan. However, previous tree-based erasable pattern mining methods have difficulty processing dynamic data streams because they need two or more database scans to construct their tree structures. In addition, they do not consider item-specific information within a product database, although mining operations need to take such additional information into account in order to find more useful erasable patterns. For this reason, in this paper, we propose a weighted erasable pattern mining algorithm suitable for sliding window-based data stream environments. The algorithm employs tree and list data structures for more efficient mining and solves the problems of previous erasable pattern mining approaches by using a sliding window-based stream processing technique and an item weight-based pattern pruning method. We compare the performance of the proposed algorithm with state-of-the-art tree-based approaches on various real and synthetic datasets. Experimental results show that our method is more efficient and scalable than the competitors in terms of runtime, memory, and pattern generation.
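To make the problem setting concrete, the following is a minimal brute-force sketch of weighted erasable pattern mining over a sliding window. It is not the tree- and list-based algorithm proposed in the paper, and it performs no pruning. Several details are assumptions, since the abstract does not specify them: the gain of a pattern is taken as the total profit of the products in the window that contain at least one of its items, the threshold is treated as a fraction of the window's total profit, and the weighted gain is taken as the gain multiplied by the average item weight. The class name `SlidingWindowErasableMiner` and its parameters are hypothetical.

```python
from collections import deque
from itertools import combinations


class SlidingWindowErasableMiner:
    """Brute-force illustration only; not the paper's tree/list algorithm."""

    def __init__(self, window_size, threshold, weights=None, max_len=2):
        # A bounded deque drops the oldest product once the window is full.
        self.window = deque(maxlen=window_size)
        self.threshold = threshold      # assumed: fraction of total window profit
        self.weights = weights or {}    # assumed: item -> weight, default 1.0
        self.max_len = max_len          # longest pattern to enumerate

    def add(self, items, profit):
        """Append one product (its component items and its profit) to the window."""
        self.window.append((frozenset(items), profit))

    def gain(self, pattern):
        """Total profit of window products containing at least one pattern item."""
        pattern = set(pattern)
        return sum(profit for items, profit in self.window if items & pattern)

    def weighted_gain(self, pattern):
        """Assumed weighting scheme: gain scaled by the average item weight."""
        avg_weight = sum(self.weights.get(i, 1.0) for i in pattern) / len(pattern)
        return self.gain(pattern) * avg_weight

    def mine(self):
        """Return every pattern (up to max_len) whose weighted gain is within the limit."""
        limit = self.threshold * sum(profit for _, profit in self.window)
        items = sorted(set().union(*(items for items, _ in self.window)))
        result = {}
        for k in range(1, self.max_len + 1):
            for pattern in combinations(items, k):
                wg = self.weighted_gain(pattern)
                if wg <= limit:
                    result[pattern] = wg
        return result
```

Each call to `add` is the only pass made over an incoming product, which mirrors the single-scan requirement of stream processing; `mine` can then be called whenever results for the current window are needed. The exhaustive enumeration in `mine` is exponential in `max_len` and is used here only to keep the sketch short.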
