
A More Accurate Text Classifier for Positive and Unlabeled data



Abstract

Almost all LPU (learning from positive and unlabeled data) algorithms rely heavily on two steps: extracting a reliable negative dataset and supplementing the positive dataset. To address both steps, this paper proposes an original two-step approach, CoTrain-Active. The first step employs the CoTrain algorithm, which iteratively purifies the unlabeled set with two individual SVM base classifiers. The second step adopts an active-learning algorithm, which further expands the positive set by requesting the true labels of the "suspect positive" examples. Comprehensive experiments demonstrate that our approach is superior to Biased-SVM, which is reported to be the previous best. Moreover, CoTrain-Active is especially suitable for situations where the given positive dataset P is extremely insufficient.
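The abstract does not spell out the procedure, so the following is only a minimal sketch of how the two steps could fit together, not the authors' implementation. Several things here are assumptions made for illustration: a nearest-centroid rule stands in for the two SVM base classifiers, the two "views" are simply the first and second half of each feature vector, and the names `purify_unlabeled`, `expand_positives`, `oracle`, and `budget` are hypothetical.

```python
from typing import Callable, List, Sequence, Tuple

Vector = Sequence[float]


def centroid(rows: List[Vector]) -> List[float]:
    """Component-wise mean of a non-empty list of vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def sq_dist(a: Vector, b: Vector) -> float:
    return sum((x - y) ** 2 for x, y in zip(a, b))


def train_base_classifier(pos: List[Vector], neg: List[Vector]) -> Callable[[Vector], bool]:
    """Stand-in for an SVM (assumption): predict positive when x is closer
    to the positive centroid than to the negative centroid."""
    cp, cn = centroid(pos), centroid(neg)
    return lambda x: sq_dist(x, cp) < sq_dist(x, cn)


def view(x: Vector, v: int) -> Vector:
    """Hypothetical feature split: view 0 is the first half, view 1 the rest."""
    half = len(x) // 2
    return x[:half] if v == 0 else x[half:]


def purify_unlabeled(P: List[Vector], U: List[Vector],
                     max_rounds: int = 10) -> Tuple[List[int], List[int]]:
    """Step 1 (sketch): start by treating all of U as negative, then repeatedly
    train one classifier per view and move every example that either classifier
    calls positive into the "suspect positive" pool."""
    negatives = list(range(len(U)))
    suspects: List[int] = []
    for _ in range(max_rounds):
        if not negatives:
            break
        flagged = set()
        for v in (0, 1):
            clf = train_base_classifier([view(x, v) for x in P],
                                        [view(U[i], v) for i in negatives])
            flagged.update(i for i in negatives if clf(view(U[i], v)))
        if not flagged:
            break  # the candidate negative set is stable
        suspects.extend(sorted(flagged))
        negatives = [i for i in negatives if i not in flagged]
    return negatives, suspects


def expand_positives(P: List[Vector], U: List[Vector], suspects: List[int],
                     oracle: Callable[[Vector], bool],
                     budget: int) -> Tuple[List[Vector], List[int]]:
    """Step 2 (sketch): ask the oracle for the true label of up to `budget`
    suspects; confirmed positives join P, the rest are confirmed negatives."""
    new_P = list(P)
    confirmed_neg: List[int] = []
    for i in suspects[:budget]:
        if oracle(U[i]):
            new_P.append(U[i])
        else:
            confirmed_neg.append(i)
    return new_P, confirmed_neg


# Toy data: positives cluster near (1, 1, 1, 1), negatives near (-1, -1, -1, -1).
P = [[1.0, 1.0, 1.0, 1.0], [0.9, 1.1, 1.0, 0.8]]
U = [[1.0, 0.9, 1.1, 1.0],      # hidden positive
     [-1.0, -1.0, -1.0, -1.0],
     [-0.9, -1.1, -1.0, -0.8],
     [-1.2, -1.0, -0.9, -1.1],
     [0.8, 1.0, 0.9, 1.2]]      # hidden positive

negatives, suspects = purify_unlabeled(P, U)
# The oracle here is a toy labeling rule standing in for a human annotator.
new_P, confirmed_neg = expand_positives(P, U, suspects,
                                        oracle=lambda x: sum(x) > 0, budget=5)
```

On this toy data the purification step isolates the two hidden positives as suspects, and the labeling step adds both to the positive set. With real text data, the base classifiers, the view construction, and the stopping rule would all need to follow the paper itself.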
