
The Strength of the Weakest Supervision: Topic Classification Using Class Labels



Abstract

When developing topic classifiers for real-world applications, we begin by defining a set of meaningful topic labels. Ideally, an intelligent classifier can understand these labels right away and start classifying documents. Indeed, a human can confidently tell if a news article is about science, politics, sports, or none of the above, after knowing just the class labels. We study the problem of training an initial topic classifier using only class labels. We investigate existing techniques for solving this problem and propose a simple but effective approach. Experiments on a variety of topic classification data sets show that learning from class labels can save significant initial labeling effort, essentially providing a "free" warm start to the topic classifier.
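The setting described in the abstract, classifying documents with nothing but the label names, can be illustrated with a deliberately naive baseline. The sketch below is not the approach the paper proposes (the abstract does not describe it). The function name `classify_by_label_names`, the regex tokenizer, and the fallback string are assumptions made for illustration only: a document is assigned to the label whose words it mentions most often, and to "none of the above" when no label word appears.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def classify_by_label_names(document, labels, fallback="none of the above"):
    """Hypothetical baseline: count how often each label's words occur
    in the document and return the label with the highest count.

    No labeled training documents are used; the class labels themselves
    are the only supervision. Ties go to the label listed first.
    """
    doc_tokens = tokenize(document)
    counts = {}
    for label in labels:
        label_words = set(tokenize(label))
        counts[label] = sum(1 for tok in doc_tokens if tok in label_words)

    best = max(labels, key=lambda label: counts[label])
    # If no label word occurs at all, decline to pick a topic.
    return best if counts[best] > 0 else fallback


if __name__ == "__main__":
    topics = ["science", "politics", "sports"]
    print(classify_by_label_names("Voters follow politics closely.", topics))
    print(classify_by_label_names("The weather was pleasant today.", topics))
```

Exact word matching of this kind misses any document that discusses a topic without naming it, which is one reason a baseline like this would serve only as a starting point before labeled documents become available.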
