Journal: Computer Engineering and Intelligent Systems

Positive Unlabeled Learning Algorithm for One Class Classification of Social Text Stream with only very few Positive Training Samples

Abstract

Text classification using a small labelled set (the positive data set) and a large amount of unlabeled data is seen as a promising technique, especially for text stream classification, where it is highly likely that only a few positive examples and no negative examples are available. This paper studies how to devise a positive and unlabeled learning technique for the text stream environment. Our proposed approach works in two steps. First, we use the PNLH (Positive example and Negative example Labelling Heuristic) approach to extract both positive and negative examples from the unlabeled data. This extraction enables us to obtain an enriched vector representation for new test messages. Second, we construct a one-class classifier using a one-class SVM. Given the enriched vector representation as input, the one-class SVM predicts the importance level of each text message.
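
The two-step flow described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not spell out the PNLH rules or the SVM configuration, so step 1 below substitutes a simple cosine-similarity heuristic against the positive centroid, and step 2 substitutes a centroid-difference score for the one-class SVM decision function. All function names, the thresholds `hi` and `lo`, and the sample messages are assumptions made for illustration only.

```python
import math
from collections import Counter


def bag_of_words(text):
    """Lowercased term-frequency vector for one message."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse term vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def centroid(texts):
    """Mean term vector of a list of messages (empty if no messages)."""
    total = Counter()
    for text in texts:
        total.update(bag_of_words(text))
    return Counter({t: c / len(texts) for t, c in total.items()})


def label_unlabeled(positives, unlabeled, hi=0.5, lo=0.1):
    """Step 1 (stand-in for PNLH): pull likely positives and likely
    negatives out of the unlabeled pool by similarity to the positive
    centroid. Messages between the two thresholds stay unlabeled."""
    pos_centroid = centroid(positives)
    likely_pos, likely_neg = [], []
    for text in unlabeled:
        sim = cosine(bag_of_words(text), pos_centroid)
        if sim >= hi:
            likely_pos.append(text)
        elif sim <= lo:
            likely_neg.append(text)
    return likely_pos, likely_neg


def enrich(text, pos_centroid, neg_centroid):
    """Enriched representation (assumed form): similarity of the message
    to the extracted positive and negative examples."""
    vec = bag_of_words(text)
    return cosine(vec, pos_centroid), cosine(vec, neg_centroid)


def importance(text, pos_centroid, neg_centroid):
    """Step 2 (stand-in for the one-class SVM score): higher means the
    message looks more like the positive class."""
    pos_sim, neg_sim = enrich(text, pos_centroid, neg_centroid)
    return pos_sim - neg_sim


# Hypothetical sample data: a few positives and an unlabeled stream.
positives = [
    "server outage reported in data center",
    "critical outage affecting server cluster",
]
unlabeled = [
    "outage in server room critical",
    "lunch menu for friday",
    "server cluster outage update",
    "friday party photos",
]

likely_pos, likely_neg = label_unlabeled(positives, unlabeled)
pos_centroid = centroid(positives + likely_pos)
neg_centroid = centroid(likely_neg)

for message in ["critical server outage tonight", "friday lunch photos"]:
    print(message, round(importance(message, pos_centroid, neg_centroid), 3))
```

In a fuller version, the enriched vectors would be passed to an actual one-class SVM trained on the positive examples; the centroid score is used here only to keep the sketch dependency-free.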
