Sliding-window filtering


Abstract

In this paper we explore an effective sliding-window filtering (SWF) algorithm for incremental mining of association rules. In essence, algorithm SWF partitions a transaction database into several partitions and applies a filtering threshold in each partition to control candidate itemset generation. Under SWF, the cumulative information from mining previous partitions is selectively carried over to the generation of candidate itemsets for subsequent partitions. Algorithm SWF not only significantly reduces I/O and CPU cost through cumulative filtering and scan reduction techniques, but also effectively controls memory utilization through sliding-window partitioning. Algorithm SWF is particularly well suited to efficient incremental mining of an ongoing, time-variant transaction database. With proper scan reduction techniques, algorithm SWF needs only one scan of the incremented dataset. The I/O cost of SWF is orders of magnitude smaller than that required by prior methods, thus resolving the performance bottleneck. Experimental studies are performed to evaluate the performance of algorithm SWF. The improvement achieved by algorithm SWF becomes more prominent as the incremented portion of the dataset grows and as the size of the database increases.
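The cumulative filtering idea described above can be illustrated with a minimal Python sketch. It is an illustration under stated assumptions, not the paper's implementation: the abstract does not give the exact threshold rule, so the sketch assumes a candidate 2-itemset is kept only while its accumulated count meets the minimum support over the partitions scanned since it first appeared. The names `cumulative_filter` and `frequent_pairs` and the sample data are hypothetical, and the incremental step (dropping the oldest partition and adding a new one) is not shown.

```python
from collections import defaultdict
from itertools import combinations
from math import ceil


def cumulative_filter(partitions, min_support):
    """Return candidate 2-itemsets that survive filtering in every partition.

    partitions: list of partitions, each a list of transactions (sets of items).
    min_support: minimum support as a fraction, e.g. 0.5.
    Assumed rule (not confirmed by the abstract): an itemset that first
    appeared in partition `start` is kept after partition `k` only if its
    accumulated count reaches min_support * (transactions in start..k).
    """
    carried = {}  # itemset -> (start partition index, accumulated count)
    for k, partition in enumerate(partitions):
        # Count every 2-itemset that occurs in the current partition.
        local = defaultdict(int)
        for transaction in partition:
            for pair in combinations(sorted(transaction), 2):
                local[pair] += 1
        # Merge local counts into the information carried from earlier partitions.
        for pair, count in local.items():
            start, total = carried.get(pair, (k, 0))
            carried[pair] = (start, total + count)
        # Apply the filtering threshold; itemsets below it are dropped.
        survivors = {}
        for pair, (start, total) in carried.items():
            size = sum(len(p) for p in partitions[start:k + 1])
            if total >= ceil(min_support * size):
                survivors[pair] = (start, total)
        carried = survivors
    return carried


def frequent_pairs(partitions, min_support):
    """Scan the data once more to get exact supports of surviving candidates."""
    counts = {pair: 0 for pair in cumulative_filter(partitions, min_support)}
    n = sum(len(p) for p in partitions)
    for partition in partitions:
        for transaction in partition:
            for pair in counts:
                if set(pair) <= transaction:
                    counts[pair] += 1
    return {pair: c for pair, c in counts.items() if c >= ceil(min_support * n)}


# Hypothetical data: two partitions of three transactions each.
parts = [
    [{"a", "b", "c"}, {"a", "b"}, {"b", "c"}],
    [{"a", "b"}, {"a", "c"}, {"d", "e"}],
]
print(frequent_pairs(parts, 0.5))  # → {('a', 'b'): 3}
```

In this example the pair (b, c) passes the threshold in the first partition but is filtered out after the second, so only (a, b) reaches the final counting scan.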
