
Efficient frequent itemset mining methods over time-sensitive streams


Abstract

Stream data arrive dynamically and rapidly, and the traditional transaction-based sliding window cannot reflect these characteristics; as a result, the mining results are inaccurate. This paper addresses the problem by constructing a timestamp-based sliding window model, which can be further converted into a transaction-based sliding window. Based on this model, an extended enumeration tree is developed to incrementally maintain the essential information. In the proposed frequent itemset mining algorithm, we introduce a type transforming bound to dynamically classify itemsets into categories, so that the processing of certain itemsets can be deferred or ignored: an itemset is not handled unless its type transforming bound reaches a threshold, which prunes computation. However, this algorithm only guarantees the conditions needed to obtain accurate results and therefore does not achieve the best performance. Our approximate mining algorithm improves on this with a heuristic rule-based strategy, which saves further computational cost at a tolerable mining error. Theoretical analysis and experimental studies demonstrate that the proposed algorithms achieve high accuracy, use less computational time and memory, and significantly outperform the baseline method and state-of-the-art algorithms.
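
As a rough illustration of the timestamp-based window idea only, the following Python sketch keeps the transactions whose timestamps fall within a fixed time span and reports itemsets whose support, relative to the number of transactions currently in the window, meets a threshold. It is a brute-force baseline under stated assumptions, not the paper's extended enumeration tree or type transforming bound: the class name `TimestampWindow`, the parameters `span`, `max_len` and `min_support`, and the requirement that timestamps arrive in non-decreasing order are all hypothetical choices made for this example.

```python
from collections import Counter, deque
from itertools import combinations


class TimestampWindow:
    """Hypothetical helper: keeps transactions with timestamp in (now - span, now]."""

    def __init__(self, span, max_len=2):
        self.span = span          # window length in time units (assumed parameter)
        self.max_len = max_len    # largest itemset size enumerated (assumed parameter)
        self.window = deque()     # (timestamp, frozenset of items), oldest first
        self.counts = Counter()   # support count of each itemset in the window

    def _subsets(self, items):
        # Enumerate every itemset of size 1..max_len as a sorted tuple.
        ordered = sorted(items)
        for k in range(1, min(self.max_len, len(ordered)) + 1):
            yield from combinations(ordered, k)

    def add(self, timestamp, items):
        # Assumes timestamps arrive in non-decreasing order.
        items = frozenset(items)
        self.window.append((timestamp, items))
        for subset in self._subsets(items):
            self.counts[subset] += 1
        self._evict(timestamp)

    def _evict(self, now):
        # Drop transactions that have fallen out of the time span.
        while self.window and self.window[0][0] <= now - self.span:
            _, old_items = self.window.popleft()
            for subset in self._subsets(old_items):
                self.counts[subset] -= 1
                if self.counts[subset] == 0:
                    del self.counts[subset]

    def frequent(self, min_support):
        # Support is relative to the number of transactions now in the window,
        # which plays the role of a transaction-based window of varying size.
        n = len(self.window)
        if n == 0:
            return {}
        return {s: c for s, c in self.counts.items() if c / n >= min_support}


w = TimestampWindow(span=10)
w.add(1, {"a", "b"})
w.add(4, {"a", "c"})
w.add(12, {"a", "b"})      # the transaction at time 1 is evicted here
print(w.frequent(0.6))     # {('a',): 2}
```

Because the number of transactions inside a fixed time span varies with the arrival rate, the denominator `n` changes from one query to the next; that is the property a fixed-size transaction-based window does not capture.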

Bibliographic record

  • Source
    Knowledge-Based Systems | 2014, Issue 1 | pp. 281-298 | 18 pages
  • Author affiliations

    School of Information, Central University of Finance and Economics, Beijing 100081, China;

    School of Information, Central University of Finance and Economics, Beijing 100081, China;

    School of Information, Central University of Finance and Economics, Beijing 100081, China;

    School of Information, Central University of Finance and Economics, Beijing 100081, China;

    School of Information, Central University of Finance and Economics, Beijing 100081, China;

  • Indexing information
  • Original format: PDF
  • Language of text: English
  • Chinese Library Classification
  • Keywords

    Stream; Frequent itemset; Data mining; Association rules; Time-sensitive;

