Heterogeneous ensemble selection for evolving data streams

Luong Anh Vu; Nguyen Tien Thanh; Liew Alan Wee-Chung; Wang Shilin


Abstract

Ensemble learning has been widely applied to both batch data classification and streaming data classification. For the latter setting, most existing ensemble systems are homogeneous, meaning they are generated from only one type of learning model. In contrast, by combining several different types of learning models, a heterogeneous ensemble system can achieve greater diversity among its members, which helps to improve its performance. Although heterogeneous ensemble systems have achieved many successes in the batch classification setting, extending them directly to the data stream setting is not trivial. In this study, we propose a novel HEterogeneous Ensemble Selection (HEES) method, which dynamically selects an appropriate subset of base classifiers to predict data in the stream setting. We are inspired by the observation that a well-chosen subset of good base classifiers may outperform the whole ensemble system. Here, we define a good candidate as one that exhibits not only high predictive performance but also high confidence in its prediction. Our selection process is thus divided into two sub-processes: accurate-candidate selection and confident-candidate selection. We define an accurate candidate in the stream context as a base classifier with high accuracy over the current concept, and a confident candidate as one whose confidence score is higher than a certain threshold. In the first sub-process, we employ the prequential accuracy to estimate the performance of a base classifier at a specific time; in the second sub-process, we propose a new measure to quantify the predictive confidence and provide a method to learn the threshold incrementally. The final ensemble is formed by taking the intersection of the sets of confident classifiers and accurate classifiers. Experiments on a wide range of data streams show that the proposed method achieves competitive performance with lower running time than state-of-the-art online ensemble methods. (C) 2020 Elsevier Ltd. All rights reserved.
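The two-stage selection described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the confidence measure, the threshold-learning rule, or the accuracy cut-off, so everything below is an assumption made for illustration. The class names `PrequentialAccuracy` and `RunningMean`, the function `select_members`, the fading factor, the use of a running mean as the confidence threshold, and the fallback for an empty intersection are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Set


@dataclass
class PrequentialAccuracy:
    """Test-then-train accuracy; the fading factor is an assumption."""
    fading: float = 0.99
    correct: float = 0.0
    seen: float = 0.0

    def update(self, was_correct: bool) -> None:
        # Older outcomes are down-weighted so the estimate tracks
        # the current concept rather than the whole stream.
        self.correct = self.fading * self.correct + (1.0 if was_correct else 0.0)
        self.seen = self.fading * self.seen + 1.0

    @property
    def value(self) -> float:
        return self.correct / self.seen if self.seen > 0 else 0.0


@dataclass
class RunningMean:
    """Incremental mean, a stand-in for the paper's learned threshold."""
    mean: float = 0.0
    count: int = 0

    def update(self, x: float) -> None:
        self.count += 1
        self.mean += (x - self.mean) / self.count


def select_members(
    accuracies: Dict[str, float],
    confidences: Dict[str, float],
    acc_threshold: float,   # hypothetical cut-off for "accurate"
    conf_threshold: float,  # hypothetical cut-off for "confident"
) -> Set[str]:
    """Return the intersection of accurate and confident classifiers."""
    accurate = {name for name, a in accuracies.items() if a >= acc_threshold}
    confident = {name for name, c in confidences.items() if c > conf_threshold}
    chosen = accurate & confident
    if not chosen:
        # Assumed fallback (not from the abstract): keep the single
        # most accurate classifier so a prediction is always possible.
        chosen = {max(accuracies, key=accuracies.get)}
    return chosen


if __name__ == "__main__":
    accs = {"nb": 0.90, "tree": 0.60, "knn": 0.85}
    confs = {"nb": 0.70, "tree": 0.95, "knn": 0.40}
    print(select_members(accs, confs, acc_threshold=0.8, conf_threshold=0.5))
```

In this toy run, `nb` and `knn` pass the accuracy cut-off and `nb` and `tree` pass the confidence cut-off, so only `nb` is selected. In a stream, each base classifier would first predict the incoming instance, its `PrequentialAccuracy` would be updated once the label arrives, and only then would the classifier be trained on that instance.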


Bibliographic details

Source
Pattern Recognition: The Journal of the Pattern Recognition Society | 2021, Issue 1 | 16 pages
Authors
Luong Anh Vu; Nguyen Tien Thanh; Liew Alan Wee-Chung; Wang Shilin
Author affiliations

Griffith Univ, Sch Informat & Commun Technol, Brisbane, Qld, Australia;
Robert Gordon Univ, Sch Comp Sci & Digital Media, Aberdeen, Scotland;
Griffith Univ, Sch Informat & Commun Technol, Brisbane, Qld, Australia;
Shanghai Jiao Tong Univ, Sch Elect Informat & Elect Engn, Shanghai, Peoples R China;
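Indexing information
Original format: PDF
Language: English
Chinese Library Classification: Computing technology, computer technology
Keywords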
Data streams; Heterogeneous ensembles; Ensemble selection;


