Journal: Knowledge and Information Systems

Dynamic classifier ensemble for positive unlabeled text stream classification

Abstract

Most studies on streaming data classification assume that the data can be fully labeled. In real-life applications, however, manually labeling the entire stream for training is impractical and time-consuming. In data stream environments it is very common that only a small set of positive examples and a large amount of unlabeled data are available. In this case, traditional streaming algorithms applied to a positive unlabeled stream with only straightforward adaptation may perform poorly. In this paper, we propose a Dynamic Classifier Ensemble method for Positive and Unlabeled text stream (DCEPU) classification scenarios. We address the problem of classifying positive and unlabeled text streams under various types of concept drift by constructing an appropriate validation set and designing a novel dynamic weighting scheme for the classification phase. Experimental results on the benchmark dataset RCV1-v2 demonstrate that the proposed method DCEPU outperforms the existing LELC (Li et al. 2009b), DVS (with necessary adaptation) (Tsymbal et al. in Inf Fusion 9(1):56-68, 2008), and Stacking-style ensemble-based algorithm (Zhang et al. 2008b).
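
The abstract does not say how DCEPU builds its validation set or computes its weights, so the following is only a minimal sketch of the general idea of dynamic classifier weighting, not the DCEPU algorithm itself. It assumes a labeled validation set is already available, that base classifiers are plain functions returning +1 or -1, and that each classifier is weighted by its accuracy on the validation documents most similar to the test document. All names (`cosine`, `dynamic_weights`, `predict`, `keyword_classifier`) are hypothetical.

```python
from collections import Counter
from math import sqrt


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def dynamic_weights(classifiers, validation, x, k=3):
    """Weight each classifier by its accuracy on the k validation docs nearest to x."""
    neighbours = sorted(validation, key=lambda item: cosine(item[0], x), reverse=True)[:k]
    if not neighbours:
        # No validation data: fall back to uniform weights (an assumption).
        return [1.0] * len(classifiers)
    return [
        sum(clf(doc) == label for doc, label in neighbours) / len(neighbours)
        for clf in classifiers
    ]


def predict(classifiers, validation, x, k=3):
    """Weighted vote of the ensemble; returns +1 (positive) or -1 (negative)."""
    weights = dynamic_weights(classifiers, validation, x, k)
    score = sum(w * clf(x) for w, clf in zip(weights, classifiers))
    return 1 if score > 0 else -1


def keyword_classifier(word):
    """Toy base classifier: positive if the document contains `word`."""
    return lambda doc: 1 if doc.get(word, 0) > 0 else -1


# Toy validation set: finance documents are positive, sports documents negative.
validation = [
    (Counter("stocks rise market".split()), 1),
    (Counter("market falls stocks".split()), 1),
    (Counter("football match result".split()), -1),
    (Counter("tennis match final".split()), -1),
]

# The second classifier stands in for a model made obsolete by concept drift.
classifiers = [keyword_classifier("market"), keyword_classifier("match")]

x = Counter("stocks market today".split())
weights = dynamic_weights(classifiers, validation, x)
label = predict(classifiers, validation, x)
print(weights, label)
```

On this toy data an unweighted vote would tie, because the two classifiers disagree on `x`. The per-instance weights break the tie in favor of the classifier that is accurate on documents similar to `x`.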
