RLS: A reduced labeled samples approach for streaming imbalanced data with concept drift

Abstract

In streaming data, the input distribution is not static, and models must be updated when concept drift occurs in order to maintain classification performance. Updating a model requires retraining on newly arriving labeled samples. However, labeling data is costly and time-consuming, so algorithms are needed that do not require every instance in the stream to be labeled. This paper proposes a new Reduced Labeled Samples (RLS) framework that handles concept drift in imbalanced data streams by selectively labeling only the samples that are most useful for characterizing the drift, thereby generating an updated model from fewer labeled samples. Experimental comparison with state-of-the-art imbalanced stream classification algorithms shows that the RLS framework achieves comparable or better performance while requiring only 18% of the samples to be labeled.
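
The abstract does not say how RLS decides which samples are "most useful", so the following is only a rough sketch of the general idea of labeling a small share of each chunk. It swaps in a simple stand-in criterion (request labels for the predictions closest to the decision boundary) and is not the authors' method. The function name `select_for_labeling`, the `budget_fraction` parameter, and the example probabilities are all hypothetical.

```python
import math


def select_for_labeling(probabilities, budget_fraction):
    """Pick which samples in a chunk to send to a human labeler.

    probabilities: predicted minority-class probability for each sample
                   (assumed to come from the current model).
    budget_fraction: share of the chunk that may be labeled, in (0, 1].

    Stand-in criterion (an assumption, not taken from the paper):
    the samples whose prediction is closest to 0.5 are labeled first.
    """
    if not 0 < budget_fraction <= 1:
        raise ValueError("budget_fraction must be in (0, 1]")
    n_to_label = math.ceil(budget_fraction * len(probabilities))
    # Rank sample indices by distance from the decision boundary (0.5).
    ranked = sorted(
        range(len(probabilities)),
        key=lambda i: abs(probabilities[i] - 0.5),
    )
    # Return the chosen indices in stream order.
    return sorted(ranked[:n_to_label])


# Hypothetical chunk of ten predictions; label at most 20% of them.
chunk = [0.02, 0.48, 0.91, 0.55, 0.10, 0.97, 0.40, 0.85, 0.05, 0.60]
chosen = select_for_labeling(chunk, budget_fraction=0.2)
print(chosen)
```

Only the chosen samples would be labeled and used for retraining. A full pipeline would also need a drift detector and some handling of class imbalance, both of which this sketch leaves out.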