Pattern Recognition: The Journal of the Pattern Recognition Society

Heterogeneous ensemble selection for evolving data streams

Abstract

Ensemble learning has been widely applied to both batch and streaming data classification. For the latter setting, most existing ensemble systems are homogeneous, meaning that they are generated from only one type of learning model. In contrast, by combining several different types of learning models, a heterogeneous ensemble system can achieve greater diversity among its members, which helps to improve its performance. Although heterogeneous ensemble systems have achieved many successes in the batch classification setting, extending them directly to the data stream setting is not trivial. In this study, we propose a novel HEterogeneous Ensemble Selection (HEES) method, which dynamically selects an appropriate subset of base classifiers to predict data under the stream setting. We are inspired by the observation that a well-chosen subset of good base classifiers may outperform the whole ensemble system. Here, we define a good candidate as one that exhibits not only high predictive performance but also high confidence in its prediction. Our selection process is thus divided into two sub-processes: accurate-candidate selection and confident-candidate selection. We define an accurate candidate in the stream context as a base classifier with high accuracy over the current concept, and a confident candidate as one with a confidence score higher than a certain threshold. In the first sub-process, we employ the prequential accuracy to estimate the performance of a base classifier at a specific time, while in the second sub-process, we propose a new measure to quantify the predictive confidence and provide a method to learn the threshold incrementally. The final ensemble is formed by taking the intersection of the sets of confident classifiers and accurate classifiers. Experiments on a wide range of data streams show that the proposed method achieves competitive performance with lower running time in comparison to the state-of-the-art online ensemble methods. (C) 2020 Elsevier Ltd. All rights reserved.
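
The selection step described above (intersecting the accurate and confident candidates, with accuracy tracked prequentially) can be illustrated with a minimal Python sketch. This is not the HEES implementation: the abstract does not specify the confidence measure, the accuracy cutoff, or how the threshold is learned, so the sketch substitutes stand-ins. It assumes confidence is the top class probability, treats a member as accurate when it reaches the ensemble's mean prequential accuracy, takes the threshold as a fixed argument, and falls back to the full ensemble when the intersection is empty. All names (`MemberStats`, `select_members`, `predict`, `update`) are hypothetical.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class MemberStats:
    """Running prequential counts for one base classifier."""
    correct: int = 0
    seen: int = 0

    @property
    def accuracy(self) -> float:
        return self.correct / self.seen if self.seen else 0.0


def select_members(accuracies, confidences, acc_cutoff, conf_threshold):
    """Return indices of members that are both accurate and confident."""
    accurate = {i for i, a in enumerate(accuracies) if a >= acc_cutoff}
    confident = {i for i, c in enumerate(confidences) if c >= conf_threshold}
    return accurate & confident


def predict(probas, stats, conf_threshold):
    """Majority vote over the selected members.

    probas holds one {label: probability} dict per base classifier.
    """
    accuracies = [s.accuracy for s in stats]
    # Assumption: "accurate" means at least the ensemble's mean accuracy.
    acc_cutoff = sum(accuracies) / len(accuracies)
    # Assumption: confidence is the top class probability, a stand-in
    # for the measure proposed in the paper.
    confidences = [max(p.values()) for p in probas]
    chosen = select_members(accuracies, confidences, acc_cutoff, conf_threshold)
    if not chosen:
        # Assumption: use every member when nobody passes both filters.
        chosen = set(range(len(probas)))
    votes = Counter(max(probas[i], key=probas[i].get) for i in sorted(chosen))
    return votes.most_common(1)[0][0], chosen


def update(stats, probas, true_label):
    """Prequential bookkeeping: score each prediction once the label arrives."""
    for s, p in zip(stats, probas):
        s.seen += 1
        s.correct += int(max(p, key=p.get) == true_label)
```

In a stream loop, `predict` would be called on each incoming instance before its label is known, and `update` afterwards, followed by training each base classifier on that instance (test-then-train). A forgetting mechanism such as a sliding window over the counts would be needed to track "the current concept"; the sketch omits it.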
