A Novel Sampling Strategy for Active Learning over Evolving Stream Data

Abstract

In classification tasks, data labeling is an expensive and time-consuming process. Active learning, which queries labels for only a small, representative portion of the data, is therefore becoming increasingly important. However, few works consider the challenges of the data stream setting, because most active learning methods are designed for the non-streaming setting. To address this gap, we combine the evidence-based uncertainty sampling strategy with the split sampling strategy and propose a new sampling strategy for active learning over evolving stream data that draws on the strengths of both. First, the original data stream is randomly divided into two sub-streams. Instances from one sub-stream are labeled according to the high evidence-focused uncertainty strategy, while instances from the other sub-stream are labeled by the random strategy in order to detect true concept drifts. Second, we introduce a sliding window into the high evidence-focused uncertainty strategy to determine whether an instance is a conflict-uncertainty instance. Our strategy addresses the effective use of evidence in the data stream setting and can choose more representative instances from evolving data streams for training a model. Finally, in experiments on four benchmark datasets, the proposed approach shows good predictive performance compared with state-of-the-art active learning strategies.
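The two-step procedure in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation. The abstract does not say how evidence is measured or how the sliding window decides that an instance is a conflict-uncertainty instance, so the sketch assumes a hypothetical `score(x)` callback that returns the positive-class probability and an evidence total, and it treats an uncertain instance as high-evidence when its evidence is at least the mean over the window. The names `select_for_labeling`, `split_ratio`, `random_rate`, `margin_threshold`, and `window_size` are likewise illustrative.

```python
import random
from collections import deque


def select_for_labeling(stream, score, split_ratio=0.5, random_rate=0.1,
                        margin_threshold=0.2, window_size=50, seed=0):
    """Yield (index, reason) for every instance chosen for labeling.

    `score(x)` is a hypothetical callback returning (p_pos, evidence):
    the model's positive-class probability and a non-negative evidence total.
    """
    rng = random.Random(seed)
    # Evidence totals of recently seen uncertain instances (assumed window use).
    window = deque(maxlen=window_size)

    for i, x in enumerate(stream):
        if rng.random() < split_ratio:
            # Sub-stream 1: uncertainty sampling restricted to high evidence.
            p_pos, evidence = score(x)
            margin = abs(2.0 * p_pos - 1.0)  # |p_pos - p_neg| for two classes
            if margin <= margin_threshold:
                # Assumption: "high evidence" means at least the window mean.
                is_high = not window or evidence >= sum(window) / len(window)
                window.append(evidence)
                if is_high:
                    yield i, "uncertain"
        else:
            # Sub-stream 2: random labeling, intended to expose concept drift
            # in regions the uncertainty strategy would never query.
            if rng.random() < random_rate:
                yield i, "random"


if __name__ == "__main__":
    # Toy stream: each instance is just its evidence value; the model is
    # maximally uncertain (p_pos = 0.5) about all of them.
    toy = [4.0, 1.0, 3.0, 5.0, 0.5]
    for idx, reason in select_for_labeling(toy, lambda e: (0.5, e)):
        print(idx, reason)
```

Routing each instance at random, rather than alternating, keeps the two sub-streams statistically alike, which is what lets the randomly labeled sub-stream act as an unbiased check for drift.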

Bibliographic record

Source: International Conference on Computer Engineering, Information Science Application Technology, 2017, p. 509 (7 pages in total)
Authors: Xuxu Zhang; Zhi Cao; Li Peng; Siqi Ren
Original format: PDF
Chinese Library Classification: TP3-53
Keywords: Active learning; Data streams; Evidence; Random strategy
