International Conference on Machine Learning and Computing

An Investigation on Linear SVM and its Variants for Text Categorization


Abstract

Linear Support Vector Machines (SVMs) have been used successfully to classify text documents into a set of concepts. With a growing number of linear SVM formulations and decomposition algorithms publicly available, this paper studies their efficiency and efficacy for text categorization tasks. Eight publicly available implementations are compared in terms of Break Even Point (BEP), F1 measure, ROC plots, learning speed, and sensitivity to the penalty parameter, based on experimental results on two benchmark text corpora. The results show that, of the eight implementations, SVMlin and Proximal SVM perform better in terms of consistent performance and reduced training time. Proximal SVM is particularly appealing, however, because it is an extremely simple algorithm whose training time is independent of both the penalty parameter and the category being trained. We further investigated fuzzy proximal SVM on both text corpora; it showed improved generalization over proximal SVM.
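To illustrate why a proximal SVM can have a training cost that does not depend on the penalty parameter, here is a minimal sketch. It assumes the regularized least-squares formulation commonly associated with linear proximal SVM, in which training reduces to a single linear solve and the penalty only rescales the diagonal of the system. The abstract does not specify the formulation or the implementation that was evaluated, so the function names `train_proximal_svm` and `predict`, the parameter name `nu`, and the toy data are all hypothetical.

```python
import numpy as np


def train_proximal_svm(A, d, nu=1.0):
    """Fit a linear proximal SVM (assumed least-squares formulation).

    A  : (m, n) array of document feature vectors.
    d  : (m,) array of labels in {-1, +1}.
    nu : penalty parameter (hypothetical name); it only scales the
         diagonal term, so the amount of work is the same for any value.
    """
    m, n = A.shape
    # Augment the features with a column of -1 so the threshold gamma
    # is solved for together with the weight vector w.
    E = np.hstack([A, -np.ones((m, 1))])
    lhs = np.eye(n + 1) / nu + E.T @ E   # (n+1) x (n+1) system matrix
    rhs = E.T @ d                        # labels applied to the rows of E
    solution = np.linalg.solve(lhs, rhs)
    return solution[:n], solution[n]     # w, gamma


def predict(A, w, gamma):
    """Assign +1 or -1 according to the side of the plane x.w = gamma."""
    return np.where(A @ w - gamma >= 0, 1, -1)


if __name__ == "__main__":
    # Toy, linearly separable data (illustrative only, not from the paper).
    A = np.array([[2.0, 2.0], [3.0, 3.0], [-2.0, -2.0], [-3.0, -3.0]])
    d = np.array([1.0, 1.0, -1.0, -1.0])
    w, gamma = train_proximal_svm(A, d, nu=1.0)
    print(predict(A, w, gamma))
```

In a one-vs-rest text categorization setting, one such solve would be run per category with a different label vector `d`; the system matrix depends only on the features and `nu`, which is consistent with the abstract's remark that training time does not depend on the category.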
