Journal: Expert Systems with Applications

Sliding window based weighted maximal frequent pattern mining over data streams



Abstract

As data have accumulated more quickly in recent years, the corresponding databases have grown larger, and general frequent pattern mining methods have run into limitations in handling such massive data. To overcome this problem, data mining researchers have studied methods that perform mining more efficiently and immediately by scanning the database only once. The sliding window model, which focuses mining operations on the most recently accumulated part of a data stream, was subsequently proposed, and a variety of related mining approaches have been suggested. However, it is hard to mine all frequent patterns in a data stream environment, since the number of generated patterns increases markedly as the stream is continuously extended. Methods for efficiently compressing the generated patterns are therefore needed. In addition, since weight constraints expressing the importance of items, and not only support conditions, are important factors in pattern mining, they need to be considered in the mining process. Motivated by these issues, we propose a novel algorithm, weighted maximal frequent pattern mining over data streams based on the sliding window model (WMFP-SW), which obtains weighted maximal frequent patterns reflecting recent information in data streams. Performance experiments show that WMFP-SW outperforms previous algorithms in terms of runtime, memory usage, and scalability.
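The ideas named in the abstract (a sliding window, item weights, and maximal patterns as a compressed result) can be illustrated with a small brute-force sketch. This is not the WMFP-SW algorithm itself, whose data structures and pruning are not described here. The function names, the toy data, and the weighted-support measure (support count times the average weight of the pattern's items) are assumptions made for illustration only.

```python
from collections import Counter, deque
from itertools import combinations


def weighted_maximal_patterns(window, weights, min_wsup):
    """Brute-force weighted maximal frequent patterns for one window."""
    # Count the support of every itemset that occurs in the window.
    support = Counter()
    for transaction in window:
        items = sorted(set(transaction))
        for size in range(1, len(items) + 1):
            for subset in combinations(items, size):
                support[frozenset(subset)] += 1

    # Assumed measure: weighted support = support * average item weight.
    def wsup(pattern):
        avg_weight = sum(weights[i] for i in pattern) / len(pattern)
        return support[pattern] * avg_weight

    frequent = [p for p in support if wsup(p) >= min_wsup]
    # Keep only patterns that have no weighted frequent proper superset.
    return {p for p in frequent if not any(p < q for q in frequent)}


def mine_stream(stream, weights, window_size, min_wsup):
    """Yield the maximal patterns of the current window after each arrival."""
    window = deque(maxlen=window_size)  # the oldest transaction drops out
    for transaction in stream:
        window.append(transaction)
        if len(window) == window_size:
            yield weighted_maximal_patterns(window, weights, min_wsup)


# Hypothetical toy data: item weights and a four-transaction stream.
weights = {"a": 1.0, "b": 0.5, "c": 0.25}
stream = [["a", "b"], ["a", "b", "c"], ["a", "c"], ["b", "c"]]
for patterns in mine_stream(stream, weights, window_size=3, min_wsup=1.4):
    print(sorted(sorted(p) for p in patterns))
```

With this toy data the first full window yields the single maximal pattern {a, b}, and after the window slides by one transaction the result shrinks to {a}, which shows how the output tracks only recent data. Enumerating every subset is exponential in transaction length, so the sketch is only suitable for tiny inputs; a practical method would need a compact structure and pruning.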
