Conference: International Symposium on Intelligent Data Analysis

MCut: A Thresholding Strategy for Multi-label Classification

Abstract

Multi-label classification is a frequent task in machine learning, notably in text categorization. When binary classifiers are not suited, an alternative is to use a multiclass classifier that provides a score per category for each document, and then to apply a thresholding strategy to select the set of categories to assign to the document. Common thresholding strategies, such as RCut, PCut and SCut, need a training step to determine the value of the threshold. To overcome this limitation, we propose a new strategy, called MCut, which automatically estimates a value for the threshold. This method does not have to be trained and does not need any parametrization. Experiments performed on two text corpora, the XML Mining 2009 and RCV1 collections, show that the results of MCut are on par with the state of the art, while MCut is easy to implement and parameter-free.
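The abstract does not spell out how MCut estimates the threshold. The sketch below is a minimal illustration of one training-free rule, under the assumption that the cut is placed in the middle of the largest gap between consecutive scores once they are sorted in descending order. The function name `mcut_select` and the example scores are hypothetical and do not come from the paper.

```python
from typing import Dict, Set


def mcut_select(scores: Dict[str, float]) -> Set[str]:
    """Select the categories scoring above the largest gap (assumed rule)."""
    if len(scores) < 2:
        # With fewer than two scores there is no gap to cut at; keep what exists.
        return set(scores)
    ranked = sorted(scores.values(), reverse=True)
    # Differences between consecutive scores in descending order.
    gaps = [ranked[i] - ranked[i + 1] for i in range(len(ranked) - 1)]
    # Index of the first largest gap; the threshold sits at its midpoint.
    cut = max(range(len(gaps)), key=gaps.__getitem__)
    threshold = (ranked[cut] + ranked[cut + 1]) / 2
    # Note: if all scores are equal, every gap is 0 and nothing is selected.
    return {label for label, score in scores.items() if score > threshold}


# Hypothetical per-category scores for one document.
doc_scores = {"sports": 0.91, "politics": 0.84, "economy": 0.22, "science": 0.15}
print(sorted(mcut_select(doc_scores)))  # prints ['politics', 'sports']
```

Under this assumption the threshold is computed separately for each document from its own scores, so nothing has to be fitted on a training set and no parameter has to be chosen, which matches the properties the abstract claims for MCut.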