Conference: International Symposium on Knowledge Acquisition and Modeling (KAM '09)

The Application Research of Topic Word List In Text Automatic Classification



Abstract

When traditional text classification techniques are applied to academic dissertations, the extracted feature terms are high-dimensional and do not represent the theme of the thesis well, so both efficiency and accuracy are low. Topic words, by contrast, are few in number and reflect the theme of a thesis well. Accordingly, the paper proposes extracting topic words with a topic word list and using them as feature terms, then classifying large volumes of text with the Bayesian classification method. The experiments show that Bayesian classification with topic words as feature terms greatly reduces the dimension and improves classification efficiency. When the dimension of the feature terms is equal, its accuracy is also higher than that of traditional Bayesian text classification methods.
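The pipeline described in the abstract (filter each document down to the words found in a topic word list, then classify with a Bayesian method) can be sketched roughly as follows. This is a minimal illustration, not the paper's implementation: the topic word list, the toy documents, the class labels, the whitespace/regex tokenizer, and the choice of a multinomial Naive Bayes model with Laplace smoothing are all assumptions, since the abstract does not specify them.

```python
import math
import re
from collections import Counter, defaultdict

# Hypothetical topic word list; the paper's actual list is not given in the abstract.
TOPIC_WORDS = {"classification", "bayesian", "feature", "protein", "gene", "enzyme"}


def extract_topic_words(text, topic_words=TOPIC_WORDS):
    """Tokenize on letters only and keep tokens found in the topic word list."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [t for t in tokens if t in topic_words]


class TopicWordNaiveBayes:
    """Multinomial Naive Bayes whose feature space is the topic word list (assumed model)."""

    def __init__(self, topic_words, alpha=1.0):
        self.topic_words = set(topic_words)
        self.alpha = alpha  # Laplace smoothing constant

    def fit(self, docs, labels):
        self.n_docs = len(docs)
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for doc, label in zip(docs, labels):
            self.word_counts[label].update(extract_topic_words(doc, self.topic_words))
        return self

    def predict(self, doc):
        feats = extract_topic_words(doc, self.topic_words)
        vocab_size = len(self.topic_words)
        best_label, best_score = None, -math.inf
        for label in sorted(self.class_counts):
            total = sum(self.word_counts[label].values())
            # log prior + sum of smoothed log likelihoods of the topic words
            score = math.log(self.class_counts[label] / self.n_docs)
            for w in feats:
                num = self.word_counts[label][w] + self.alpha
                score += math.log(num / (total + self.alpha * vocab_size))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Toy training data (made up for illustration only).
train_docs = [
    "Bayesian classification of text with feature selection",
    "Feature terms for Bayesian text classification",
    "Gene expression and protein folding",
    "Enzyme activity regulates protein and gene function",
]
train_labels = ["cs", "cs", "bio", "bio"]

model = TopicWordNaiveBayes(TOPIC_WORDS).fit(train_docs, train_labels)
```

Because the feature space is limited to the topic word list, the dimension here is six regardless of how many distinct words the documents contain, which is the dimensionality reduction the abstract refers to. Calling `model.predict(...)` on a new document returns the label with the highest posterior score.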
