Building a New Classifier in an Ensemble Using Streaming Unlabeled Data



Abstract

Manually labeling all samples in real-world streaming data is expensive and impractical when the correct class is not available in real time. In this paper, we propose an ensemble method that determines which samples from streaming unlabeled data should be labeled, and when they should be labeled, according to changes in the distribution of the streaming unlabeled data. In particular, the point in time at which labeling takes place is an important practical factor in building an efficient ensemble. To evaluate the performance of our ensemble method, we used synthetic streaming data with concept drift and the intrusion detection data from the KDD'99 Cup. We compared the results of the proposed method with those of existing ensemble methods that periodically build new classifiers for an ensemble. On the synthetic streaming data, the proposed method produced 14.1% higher classification accuracy on average, and the number of new classifiers was reduced by 12.6% on average. With the intrusion detection data, our method produced accuracy similar to that of existing methods but used only 0.007% of the labeled streaming data.
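The abstract does not say how the change in distribution is measured, so the following is only a minimal sketch of the general idea of requesting labels when the unlabeled stream drifts, not the authors' algorithm. The function names `drift_score` and `labeling_points`, the standardized mean-shift measure, and the threshold of 2.0 are all assumptions made for illustration.

```python
from statistics import mean, pstdev


def drift_score(reference, batch):
    """Largest per-feature shift of the batch mean, in reference standard deviations.

    Assumption: a simple mean-shift measure stands in for whatever
    distribution-change test the paper actually uses.
    """
    score = 0.0
    for ref_col, new_col in zip(zip(*reference), zip(*batch)):
        spread = pstdev(ref_col) or 1.0  # guard against a constant feature
        score = max(score, abs(mean(new_col) - mean(ref_col)) / spread)
    return score


def labeling_points(batches, threshold=2.0):
    """Return the indices of batches that would be sent for manual labeling."""
    reference = batches[0]  # assume the first batch trained the initial classifier
    requested = []
    for i, batch in enumerate(batches[1:], start=1):
        if drift_score(reference, batch) > threshold:
            requested.append(i)  # label this batch and train a new ensemble member
            reference = batch    # the newest labeled batch becomes the reference
    return requested


# Hypothetical two-feature stream: the first feature shifts at the third batch.
stable = [(0.0, 1.0), (0.2, 0.8), (-0.1, 1.1), (0.1, 0.9)]
shifted = [(3.0, 1.0), (3.2, 0.9), (2.9, 1.1), (3.1, 1.0)]
print(labeling_points([stable, stable, shifted, shifted]))  # → [2]
```

Only the batch at which the distribution changes is flagged, which mirrors the contrast the abstract draws with methods that build a new classifier periodically regardless of whether the stream has changed.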


Bibliographic record

Source
IEA/AIE 2010: International Conference on Industrial Engineering and Other Applications of Applied Intelligent Systems | 2010 | pp. 77-86 | 10 pages
Authors
Mehmed Kantardzic; Joung Woo Ryu; Chamila Walgampaya;
Original format: PDF
Chinese Library Classification: Artificial intelligence theory
Keywords
ensemble; unlabeled data; streaming data; concept drift;


Similar literature


1. Dynamic classifier ensemble for positive unlabeled text stream classification [J]. Shirui Pan, Yang Zhang, Xue Li. Knowledge and Information Systems. 2012, no. 2

2. Recurring Drift Detection and Model Selection-Based Ensemble Classification for Data Streams with Unlabeled Data [J]. Peipei Li, Man Wu, Junhong He, et al. New Generation Computing. 2021, no. 2

3. Building a New Classifier in an Ensemble Using Streaming Unlabeled Data [C]. Mehmed Kantardzic, Joung Woo Ryu, Chamila Walgampaya. International Conference on Industrial Engineering and Other Applications of Applied Intelligent Systems. 2010

4. Performance Envelopes of Adaptive Ensemble Data Stream Classifiers [D]. Joe-Yen, Stefan. 2017

5. Positive-unlabeled ensemble learning for kinase substrate prediction from dynamic phosphoproteomics data [O]. Pengyi Yang, Sean J. Humphrey, David E. James, et al.

6. An Online Variational Inference and Ensemble Based Multi-label Classifier for Data Streams [O]. Thi Thu Thuy Nguyen, Tien Thanh Nguyen, Alan Wee-Chung Liew, et al. 2019
