Adaptive Chunk-Based Dynamic Weighted Majority for Imbalanced Data Streams With Concept Drift

Lu Yang; Cheung Yiu-Ming; Yan Tang Yuan

Abstract

One of the most challenging problems in the field of online learning is concept drift, which deeply influences the classification stability of streaming data. If the data stream is imbalanced, it is even more difficult to detect concept drifts and make an online learner adapt to them. Ensemble algorithms have been found effective for the classification of streaming data with concept drift, whereby an individual classifier is built for each incoming data chunk and its associated weight is adjusted to manage the drift. However, it is difficult to adjust the weights to achieve a balance between the stability and adaptability of the ensemble classifiers. In addition, when the data stream is imbalanced, the use of a fixed-size chunk to build a single classifier can create further problems: the data chunk may contain too few or even no minority class samples (i.e., only majority class samples). A classifier built on such a chunk is unstable in the ensemble. In this article, we propose a chunk-based incremental learning method called adaptive chunk-based dynamic weighted majority (ACDWM) to deal with imbalanced streaming data containing concept drift. ACDWM utilizes an ensemble framework by dynamically weighting the individual classifiers according to their classification performance on the current data chunk. The chunk size is adaptively selected by statistical hypothesis tests to assess whether the classifier built on the current data chunk is sufficiently stable. Compared with existing methods, ACDWM has four advantages: 1) it can maintain stability when processing nondrifted streams and rapidly adapt to the new concept; 2) it is entirely incremental, i.e., no previous data need to be stored; 3) it stores a limited number of classifiers to ensure high efficiency; and 4) it adaptively selects the chunk size in the concept drift environment. Experiments on both synthetic and real data sets containing concept drift show that ACDWM outperforms both state-of-the-art chunk-based and online methods.
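The dynamic weighting idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' ACDWM implementation: the abstract does not give the weighting formula, the base learner, or the pruning rule, so the balanced-accuracy weight, the toy `MeanClassifier`, the 0.5 pruning threshold, and the names `WeightedChunkEnsemble` and `max_size` are all assumptions made for illustration. The chunk size is fixed here; the adaptive, hypothesis-test-based chunk-size selection is not shown.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class MeanClassifier:
    """Toy 1-D base learner (an assumption): predicts the class with the nearest training mean."""
    means: dict

    @classmethod
    def fit(cls, xs, ys):
        return cls({c: mean([x for x, y in zip(xs, ys) if y == c]) for c in set(ys)})

    def predict(self, x):
        return min(self.means, key=lambda c: abs(x - self.means[c]))


def balanced_accuracy(clf, xs, ys):
    """Mean per-class recall, so the minority class counts as much as the majority class."""
    recalls = []
    for c in set(ys):
        idx = [i for i, y in enumerate(ys) if y == c]
        recalls.append(sum(clf.predict(xs[i]) == c for i in idx) / len(idx))
    return mean(recalls)


class WeightedChunkEnsemble:
    """Chunk-based ensemble whose member weights track performance on the latest chunk."""

    def __init__(self, max_size=5):
        self.max_size = max_size  # hypothetical cap on the number of stored classifiers
        self.members = []         # list of [classifier, weight] pairs

    def update(self, xs, ys):
        # Re-weight every stored classifier by its score on the current chunk.
        for member in self.members:
            member[1] = balanced_accuracy(member[0], xs, ys)
        # Assumed pruning rule: drop members that are no better than chance.
        self.members = [m for m in self.members if m[1] > 0.5]
        # Train a new classifier on the current chunk only (no old data is stored).
        self.members.append([MeanClassifier.fit(xs, ys), 1.0])
        # Keep only the max_size highest-weighted members.
        self.members.sort(key=lambda m: m[1], reverse=True)
        del self.members[self.max_size:]

    def predict(self, x):
        # Weighted majority vote over the stored classifiers.
        votes = {}
        for clf, weight in self.members:
            label = clf.predict(x)
            votes[label] = votes.get(label, 0.0) + weight
        return max(votes, key=votes.get)


if __name__ == "__main__":
    xs = [0.0, 0.5, 1.0, 0.2, 0.8, 0.4, 0.6, 0.1, 9.5, 10.0]
    ens = WeightedChunkEnsemble(max_size=3)
    ens.update(xs, [0] * 8 + [1] * 2)  # imbalanced chunk: 8 majority, 2 minority samples
    print(ens.predict(9.0))            # 1
    ens.update(xs, [1] * 8 + [0] * 2)  # abrupt drift: the labels swap
    print(ens.predict(9.0))            # 0
```

In this toy run the classifier trained before the drift scores zero balanced accuracy on the drifted chunk and is pruned, so the ensemble follows the new concept after a single update. A real stream would need a stronger base learner and a less aggressive pruning rule.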

Bibliographic details

Source
Neural Networks and Learning Systems, IEEE Transactions on | 2020, Issue 8 | pp. 2764-2778 | 15 pages
Authors
Lu Yang; Cheung Yiu-Ming; Yan Tang Yuan
Author affiliations

Xiamen Univ Fujian Key Lab Sensing & Comp Smart City Sch Informat Xiamen 361005 Peoples R China|Hong Kong Baptist Univ Dept Comp Sci Hong Kong Peoples R China;

Hong Kong Baptist Univ Dept Comp Sci Hong Kong Peoples R China;

City Univ Hong Kong Fac Sci & Technol UOW Coll Hong Kong Community Coll Hong Kong Peoples R China|Univ Macau Dept Comp & Informat Sci Fac Sci & Technol Macau Peoples R China;

Original format: PDF
Language: English
Keywords
Thermal stability; Learning systems; Technological innovation; Detectors; Twitter; Bagging; Predictive models; Concept drift; ensemble methods; imbalance learning; online learning;

Similar literature

1. A two ensemble system to handle concept drifting data streams: recurring dynamic weighted majority [J]. Sidhu Parneeta, Bhatia M. P. S. International Journal of Machine Learning and Cybernetics. 2019, Issue 3
2. A novel online ensemble approach to handle concept drifting data streams: diversified dynamic weighted majority [J]. Sidhu Parneeta, Bhatia M. P. S. International Journal of Machine Learning and Cybernetics. 2018, Issue 1
3. An Approach for Concept Drifting Streams: Early Dynamic Weighted Majority [J]. Parneeta Dhaliwal, Ajay Kumar, Poonam Chaudhary. Procedia Computer Science. 2020, Issue 5
4. Dynamic Weighted Majority for Incremental Learning of Imbalanced Data Streams with Concept Drift [C]. Yang Lu, Yiu-ming Cheung, Yuan Yan Tang. International Joint Conference on Artificial Intelligence. 2019
5. The GC3 framework grid density based clustering for classification of streaming data with concept drift [D]. Sethi, Tegjyot Singh. 2013
6. Cost-Sensitive Classification for Evolving Data Streams with Concept Drift and Class Imbalance [O]. Yange Sun, Meng Li, Lei Li. 2021
7. Dynamic Weighted Majority: A New Ensemble Method for Tracking Concept Drift [O]. Jeremy Z. Kolter, Marcus A. Maloof. 2003
