Conference: 22nd International Conference on Computational Linguistics

An Improved Hierarchical Bayesian Model of Language for Document Classification


Abstract

This paper addresses the fundamental problem of document classification, focusing on problems in which the classes are mutually exclusive. We advocate an approximate sampling distribution for word counts in documents and demonstrate that the resulting model can outperform both the simple multinomial and more recently proposed extensions on the classification task. We also compare the classifiers to a linear SVM and show that, provided certain conditions are met, the new model exceeds the SVM's performance and ranks amongst the very best published results on the Newsgroups classification task.
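For orientation, the "simple multinomial" baseline mentioned in the abstract can be sketched roughly as follows. This is only an illustration of a multinomial naive Bayes classifier over word counts with mutually exclusive classes. It is not the paper's improved model, which the abstract does not specify. The function names, the smoothing constant `alpha`, and the toy documents are all assumptions made for this example.

```python
import math
from collections import Counter, defaultdict


def train_multinomial_nb(docs, labels, alpha=1.0):
    """Fit per-class word distributions with additive smoothing.

    docs: list of token lists; labels: one class name per document.
    alpha is a hypothetical smoothing constant, not taken from the paper.
    """
    vocab = sorted({w for doc in docs for w in doc})
    class_docs = Counter(labels)
    word_counts = defaultdict(Counter)
    for doc, label in zip(docs, labels):
        word_counts[label].update(doc)

    model = {}
    for label, n_docs in class_docs.items():
        # Smoothed total word count for this class.
        total = sum(word_counts[label].values()) + alpha * len(vocab)
        model[label] = {
            "log_prior": math.log(n_docs / len(docs)),
            "log_likelihood": {
                w: math.log((word_counts[label][w] + alpha) / total)
                for w in vocab
            },
        }
    return model


def classify(model, doc):
    """Return the single highest-scoring class for a token list."""
    def score(label):
        params = model[label]
        # Words never seen in training are skipped.
        return params["log_prior"] + sum(
            params["log_likelihood"][w]
            for w in doc
            if w in params["log_likelihood"]
        )
    return max(model, key=score)


# Toy data, invented for illustration only.
docs = [
    ["goal", "match", "team"],
    ["team", "win", "goal"],
    ["cpu", "memory", "disk"],
    ["disk", "cpu", "kernel"],
]
labels = ["sport", "sport", "tech", "tech"]
model = train_multinomial_nb(docs, labels)
prediction = classify(model, ["team", "goal", "goal"])
```

Because the classes are mutually exclusive, `classify` returns exactly one label per document. Scores are summed in log space to avoid numerical underflow on long documents.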
