Foreign-language journal: Machine Learning

Text Categorization with Support Vector Machines. How to Represent Texts in Input Space?


Abstract

The choice of the kernel function is crucial to most applications of support vector machines. In this paper, however, we show that in the case of text classification, term-frequency transformations have a larger impact on the performance of SVMs than the kernel itself. We discuss the role of importance weights (e.g. document frequency and redundancy), which is not yet fully understood in the light of model complexity and calculation cost, and we show that time-consuming lemmatization or stemming can be avoided even when classifying a highly inflectional language like German.
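To make the terms in the abstract concrete, the following is a minimal sketch of one possible input representation: a logarithmic term-frequency transformation combined with an inverse-document-frequency importance weight, followed by L2 normalization. The abstract does not say which transformations or weights the paper evaluates, so the specific formulas, the function names `tokenize` and `weighted_vectors`, and the whitespace tokenizer are all assumptions made for illustration. The redundancy weight mentioned in the abstract is not shown. In line with the abstract's point about inflectional languages, the sketch applies no stemming or lemmatization.

```python
import math
from collections import Counter


def tokenize(text):
    # Hypothetical tokenizer: lowercase and split on whitespace.
    # Deliberately no stemming or lemmatization.
    return text.lower().split()


def weighted_vectors(docs):
    """Return one sparse vector (dict term -> weight) per document.

    Assumed scheme, not taken from the paper:
      weight = log(1 + tf) * log(N / df), then L2-normalized.
    """
    counts = [Counter(tokenize(d)) for d in docs]
    n_docs = len(docs)

    # Document frequency: number of documents that contain each term.
    df = Counter(term for c in counts for term in c)

    vectors = []
    for c in counts:
        vec = {
            term: math.log(1 + freq) * math.log(n_docs / df[term])
            for term, freq in c.items()
        }
        # Normalize to unit Euclidean length so document length
        # does not dominate the representation.
        norm = math.sqrt(sum(w * w for w in vec.values()))
        vectors.append(
            {term: (w / norm if norm else 0.0) for term, w in vec.items()}
        )
    return vectors


if __name__ == "__main__":
    docs = ["the cat sat", "the dog sat down", "the bird"]
    for doc, vec in zip(docs, weighted_vectors(docs)):
        print(doc, "->", vec)
```

Vectors of this kind could then be passed to any SVM implementation. A term that occurs in every document, such as "the" in the usage example, receives weight zero under this scheme, which is the sense in which document frequency acts as an importance weight.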
