Journal: ACM Transactions on Information Systems

A Maximal Figure-of-Merit (MFoM)-Learning Approach to Robust Classifier Design for Text Categorization


Abstract

We propose a maximal figure-of-merit (MFoM) learning approach for robust classifier design that directly optimizes the performance metric of interest for a given target classifier. The approach embeds the classifier's decision functions and the performance metric into a single overall training objective, and learns the classifier parameters in a decision-feedback manner. Because it takes both positive and negative training samples into account, it reduces the amount of positive training data required. It has three desirable properties: (a) it is a performance-metric-oriented learning approach; (b) the optimized metric is consistent across the training and evaluation sets; and (c) it is more robust and less sensitive to data variation, and can handle scenarios with insufficient training data. We evaluate it on a text categorization task using the Reuters-21578 dataset. Training an F_1-based binary tree classifier with MFoM, we observed significantly improved performance and enhanced robustness compared to the baseline and SVM, especially for categories with insufficient training samples. We also demonstrate the generality of the approach for designing classifiers based on other metrics by comparing precision-, recall-, and F_1-based classifiers. The results clearly show that performance is consistent between the training and evaluation stages for each classifier, and that MFoM optimizes the chosen metric.
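The idea of embedding a decision function and a performance metric into one differentiable training objective can be illustrated with a small sketch. This is not the authors' formulation: it assumes a linear decision function, a sigmoid as the smooth stand-in for the hard accept/reject decision, and plain gradient ascent. The names `soft_f1`, `soft_f1_grad`, `train_soft_f1`, and the smoothing parameter `alpha` are hypothetical and chosen only for this example.

```python
import numpy as np

def sigmoid(z):
    # Clip to avoid overflow warnings in exp for extreme scores.
    return 1.0 / (1.0 + np.exp(-np.clip(z, -50.0, 50.0)))

def soft_f1(w, b, X, y, alpha=1.0):
    """Smoothed F1: hard decisions are replaced by sigmoid scores in [0, 1]."""
    s = sigmoid(alpha * (X @ w + b))   # soft "predicted positive" indicator
    tp = np.sum(s * y)                 # soft true-positive count
    # 2*TP + FP + FN simplifies to sum(s) + sum(y) for soft counts.
    denom = np.sum(s) + np.sum(y)
    return 2.0 * tp / denom

def soft_f1_grad(w, b, X, y, alpha=1.0):
    """Gradient of the smoothed F1 with respect to w and b (chain rule)."""
    s = sigmoid(alpha * (X @ w + b))
    tp = np.sum(s * y)
    denom = np.sum(s) + np.sum(y)
    dF_ds = (2.0 * y * denom - 2.0 * tp) / denom**2
    dF_dg = dF_ds * alpha * s * (1.0 - s)
    return X.T @ dF_dg, np.sum(dF_dg)

def train_soft_f1(X, y, alpha=1.0, lr=0.5, steps=500):
    """Gradient ascent on the smoothed F1, starting from zero parameters."""
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(steps):
        gw, gb = soft_f1_grad(w, b, X, y, alpha)
        w += lr * gw
        b += lr * gb
    return w, b

def hard_f1(w, b, X, y):
    """Ordinary F1 computed from thresholded decisions."""
    pred = (X @ w + b > 0).astype(float)
    tp = np.sum(pred * y)
    fp = np.sum(pred * (1.0 - y))
    fn = np.sum((1.0 - pred) * y)
    denom = 2.0 * tp + fp + fn
    return 2.0 * tp / denom if denom > 0 else 0.0

# Toy data: two positive and three negative one-dimensional samples.
X_toy = np.array([[2.0], [1.0], [-1.0], [-2.0], [-3.0]])
y_toy = np.array([1.0, 1.0, 0.0, 0.0, 0.0])
w_toy, b_toy = train_soft_f1(X_toy, y_toy)
```

Both positive and negative samples contribute to the gradient through the soft false-positive and false-negative counts, which is one way to read the abstract's remark about using both kinds of training samples. Swapping `soft_f1` for a smoothed precision or recall would give the other metric-oriented variants under the same assumptions.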