Journal: IEEE Transactions on Neural Networks and Learning Systems

Adaptive Chunk-Based Dynamic Weighted Majority for Imbalanced Data Streams With Concept Drift


Abstract

One of the most challenging problems in the field of online learning is concept drift, which deeply influences the classification stability of streaming data. If the data stream is imbalanced, it is even more difficult to detect concept drifts and make an online learner adapt to them. Ensemble algorithms have been found effective for the classification of streaming data with concept drift, whereby an individual classifier is built for each incoming data chunk and its associated weight is adjusted to manage the drift. However, it is difficult to adjust the weights to achieve a balance between the stability and adaptability of the ensemble classifiers. In addition, when the data stream is imbalanced, using a fixed-size chunk to build a single classifier can create further problems: the data chunk may contain too few or even no minority class samples (i.e., only majority class samples). A classifier built on such a chunk is unstable in the ensemble. In this article, we propose a chunk-based incremental learning method called adaptive chunk-based dynamic weighted majority (ACDWM) to deal with imbalanced streaming data containing concept drift. ACDWM utilizes an ensemble framework by dynamically weighting the individual classifiers according to their classification performance on the current data chunk. The chunk size is adaptively selected by statistical hypothesis tests to assess whether the classifier built on the current data chunk is sufficiently stable. ACDWM has four advantages compared with the existing methods: 1) it can maintain stability when processing nondrifted streams and rapidly adapt to a new concept; 2) it is entirely incremental, i.e., no previous data need to be stored; 3) it stores a limited number of classifiers to ensure high efficiency; and 4) it adaptively selects the chunk size in the concept drift environment. Experiments on both synthetic and real data sets containing concept drift show that ACDWM outperforms both state-of-the-art chunk-based and online methods.
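
The chunk-based weighting idea described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' ACDWM implementation: the class names `CentroidClassifier` and `ChunkWeightedEnsemble`, the toy nearest-centroid base learner, the use of balanced accuracy as the weight, and the `max_members` cap are all assumptions made for illustration, and the hypothesis-test-based chunk-size selection is omitted entirely.

```python
from collections import deque


class CentroidClassifier:
    """Toy base learner (assumption): nearest class centroid on 1-D features."""

    def __init__(self, xs, ys):
        self.centroids = {}
        for label in set(ys):
            vals = [x for x, y in zip(xs, ys) if y == label]
            self.centroids[label] = sum(vals) / len(vals)

    def predict(self, x):
        return min(self.centroids, key=lambda c: abs(x - self.centroids[c]))


def balanced_accuracy(clf, xs, ys):
    """Mean per-class recall, so a rare minority class counts as much as the majority."""
    recalls = []
    for label in set(ys):
        idx = [i for i, y in enumerate(ys) if y == label]
        hits = sum(clf.predict(xs[i]) == label for i in idx)
        recalls.append(hits / len(idx))
    return sum(recalls) / len(recalls)


class ChunkWeightedEnsemble:
    """Hypothetical sketch: one classifier per chunk, reweighted on every new chunk."""

    def __init__(self, max_members=5):
        # Bounded deques keep only the most recent classifiers and their weights.
        self.members = deque(maxlen=max_members)
        self.weights = deque(maxlen=max_members)

    def update(self, xs, ys):
        # Re-score every stored classifier on the current chunk only;
        # earlier chunks are never revisited, so no past data is stored.
        rescored = [balanced_accuracy(c, xs, ys) for c in self.members]
        self.weights = deque(rescored, maxlen=self.members.maxlen)
        # Train a new classifier on the current chunk and add it to the ensemble.
        clf = CentroidClassifier(xs, ys)
        self.members.append(clf)
        self.weights.append(balanced_accuracy(clf, xs, ys))

    def predict(self, x):
        # Weighted majority vote over the stored classifiers.
        votes = {}
        for clf, w in zip(self.members, self.weights):
            label = clf.predict(x)
            votes[label] = votes.get(label, 0.0) + w
        return max(votes, key=votes.get)


# Usage: an imbalanced chunk, then a chunk where the two classes swap places (drift).
ensemble = ChunkWeightedEnsemble(max_members=2)
ensemble.update([0, 1, 2, 0.5, 1.5, 1, 10], [0, 0, 0, 0, 0, 0, 1])
ensemble.update([10, 9, 11, 10.5, 9.5, 10, 0], [0, 0, 0, 0, 0, 0, 1])
print(ensemble.predict(0.5))  # → 1: the pre-drift classifier now has weight 0.0
```

In this sketch the classifier trained before the drift scores 0.0 on the post-drift chunk, so its vote is effectively silenced without any explicit drift detector. A real system would also need to handle chunks that contain no minority samples, which is the problem the adaptive chunk size in the abstract is meant to address.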
