Ambiguous decision trees for mining concept-drifting data streams

Abstract

In real-world situations, explanations for the same observations may differ depending on perceptions or contexts. They may also change over time, especially when concept drift occurs. This phenomenon gives rise to ambiguities. It is useful if an algorithm can learn to reflect ambiguities and select the best decision according to context or situation. Based on this viewpoint, we study the problem of deriving ambiguous decision trees from data streams to cope with concept drift. CVFDT (Concept-adapting Very Fast Decision Tree) is one of the best-known streaming data mining methods that can learn decision trees incrementally. In this paper, we establish a method called ambiguous CVFDT (aCVFDT), which integrates ambiguities into CVFDT by exploring multiple options at each node whenever a node is to be split. When aCVFDT is used to make class predictions, it is guaranteed that the best and newest knowledge is used. When old concepts recur, aCVFDT can immediately relearn them by using the corresponding options recorded at each node. Furthermore, CVFDT does not automatically detect occurrences of concept drift and only scans trees periodically, whereas aCVFDT uses an automatic concept drift detection mechanism. In our experiments, the hyperplane problem and two benchmark problems from the UCI KDD Archive, namely Network Intrusion and Forest CoverType, are used to validate the performance of aCVFDT. The experimental results show that aCVFDT obtains significantly improved results over traditional CVFDT.
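
The idea of recording several options at a node and using whichever one currently fits the stream can be illustrated with a small sketch. This is not the authors' aCVFDT algorithm: the class `OptionNode`, the function `run_demo`, the two toy concepts, and the use of sliding-window accuracy as the selection rule are all assumptions made here for illustration only.

```python
from collections import deque


class OptionNode:
    """Hypothetical node that keeps several alternative predictors ("options")."""

    def __init__(self, options, window=50):
        self.options = list(options)
        # One sliding window of 0/1 hit flags per option (assumed selection signal).
        self.hits = [deque(maxlen=window) for _ in self.options]

    def _accuracy(self, i):
        h = self.hits[i]
        return sum(h) / len(h) if h else 0.0

    def best_option(self):
        # Index of the option with the highest recent accuracy; ties go to the lowest index.
        return max(range(len(self.options)), key=self._accuracy)

    def predict(self, x):
        return self.options[self.best_option()](x)

    def update(self, x, y):
        # Score every recorded option on the new labelled example.
        for i, opt in enumerate(self.options):
            self.hits[i].append(1 if opt(x) == y else 0)


def concept_a(x):
    return int(x[0] > 0.5)


def concept_b(x):
    return int(x[1] > 0.5)


def run_demo():
    """Stream concept A, then B, then A again; record the preferred option after each phase."""
    node = OptionNode([concept_a, concept_b], window=20)
    points = [(0.9, 0.1), (0.1, 0.9)]  # the two concepts disagree on both points
    chosen = []
    for label_fn in (concept_a, concept_b, concept_a):
        for step in range(40):
            x = points[step % 2]
            node.update(x, label_fn(x))
        chosen.append(node.best_option())
    return chosen
```

Because the option for the earlier concept is kept rather than discarded, the node in this toy setting switches back to it as soon as its recent accuracy recovers, which loosely mirrors the recurring-concept behaviour the abstract describes.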

Bibliographic record

  • Source
    Pattern Recognition Letters | 2009, Issue 15 | pp. 1347-1355 | 9 pages
  • Authors

    Jing Liu; Xue Li; Weicai Zhong;

  • Author affiliations

    Institute of Intelligent Information Processing, Xidian University, Xi'an 710071, China;

    School of Information Technology and Electrical Engineering, The University of Queensland, Brisbane, Qld 4072, Australia;

    Institute of Intelligent Information Processing, Xidian University, Xi'an 710071, China;

  • Indexed in: Science Citation Index (SCI, USA); Engineering Index (EI, USA)
  • Original format: PDF
  • Language of text: English
  • Keywords

    data streams; data mining; concept drift; ambiguous decision trees; incremental learning;
