Conference: International Symposium on Intelligent Data Analysis

MCut: A Thresholding Strategy for Multi-label Classification

Abstract

Multi-label classification is a frequent task in machine learning, notably in text categorization. When binary classifiers are not suited, an alternative is to use a multiclass classifier that provides a score per category for each document, and then to apply a thresholding strategy to select the set of categories that must be assigned to the document. The common thresholding strategies, such as the RCut, PCut and SCut methods, need a training step to determine the value of the threshold. To overcome this limitation, we propose a new strategy, called MCut, which automatically estimates a value for the threshold. This method does not have to be trained and does not need any parametrization. Experiments performed on two textual corpora, the XML Mining 2009 and RCV1 collections, show that the results of the MCut strategy are on par with the state of the art, while MCut is easy to implement and parameter-free.
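To make the idea of a thresholding strategy concrete, here is a minimal Python sketch that turns per-category scores into a label set. The abstract does not spell out how MCut estimates its threshold, so the `max_gap_cut` function below is only an assumption: it places the threshold at the midpoint of the largest gap between consecutive sorted scores. The `rcut` function illustrates a rank-based cut whose parameter `k` would normally be tuned on training data. All function names and the example scores are hypothetical.

```python
from typing import Dict, Set


def rcut(scores: Dict[str, float], k: int) -> Set[str]:
    """Rank-based cut: keep the k highest-scoring categories.

    k is a parameter that would normally be tuned on training data.
    """
    ranked = sorted(scores, key=scores.get, reverse=True)
    return set(ranked[:k])


def max_gap_cut(scores: Dict[str, float]) -> Set[str]:
    """Parameter-free cut (assumed rule, not confirmed by the abstract).

    Sort the scores in decreasing order, find the largest difference
    between two consecutive scores, and use the midpoint of that gap
    as the threshold for this document.
    """
    if len(scores) < 2:
        return set(scores)
    ordered = sorted(scores.values(), reverse=True)
    gaps = [ordered[i] - ordered[i + 1] for i in range(len(ordered) - 1)]
    # Index of the largest gap (the first one wins on ties).
    i = max(range(len(gaps)), key=gaps.__getitem__)
    threshold = (ordered[i] + ordered[i + 1]) / 2
    return {cat for cat, s in scores.items() if s > threshold}


# Hypothetical classifier scores for one document.
doc_scores = {"sports": 0.9, "politics": 0.8, "tech": 0.3, "art": 0.1}
top1 = rcut(doc_scores, k=1)
gap_labels = max_gap_cut(doc_scores)
```

In this example the largest gap lies between 0.8 and 0.3, so the assumed rule keeps the two categories scoring above 0.55, without any value having been fitted beforehand.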
