Text Document Pre-Processing Using the Bayes Formula for Classification Based on the Vector Space Model

Abstract

This work uses the Bayes formula to vectorize a document as a probability distribution, derived from its keywords, over the categories to which the document may belong. For a predetermined set of topics (categories), the Bayes formula yields the probability that the document belongs to each one. With this probability distribution as the document's vector representation, text classification algorithms based on the vector space model, such as the Support Vector Machine (SVM) and the Self-Organizing Map (SOM), can then classify documents on a multi-dimensional level. This improves on the results obtained by classifying with the highest probability alone, as the naïve Bayes classifier does when used by itself. These algorithms can also overcome the effects of an inadvertent dimensionality reduction. We compare the performance of these classifiers for high-dimensional data.
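
The vectorization step described in the abstract can be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' implementation: the abstract does not specify the likelihood model, so the sketch assumes a multinomial naïve Bayes model with Laplace smoothing, and the toy corpus, the function names `train` and `posterior_vector`, and the parameter `alpha` are all hypothetical.

```python
import math
from collections import Counter, defaultdict

def train(labelled_docs, alpha=1.0):
    """Estimate category priors and Laplace-smoothed word likelihoods.

    Assumption: multinomial naive Bayes; `alpha` is the smoothing constant.
    """
    doc_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for tokens, category in labelled_docs:
        doc_counts[category] += 1
        word_counts[category].update(tokens)
        vocab.update(tokens)
    categories = sorted(doc_counts)  # fixed order = fixed vector dimensions
    total_docs = sum(doc_counts.values())
    log_prior = {c: math.log(doc_counts[c] / total_docs) for c in categories}
    log_like = {}
    for c in categories:
        denom = sum(word_counts[c].values()) + alpha * len(vocab)
        log_like[c] = {w: math.log((word_counts[c][w] + alpha) / denom)
                       for w in vocab}
    return categories, log_prior, log_like

def posterior_vector(tokens, model):
    """Return P(category | document) for every category via the Bayes formula."""
    categories, log_prior, log_like = model
    scores = []
    for c in categories:
        score = log_prior[c]
        for w in tokens:
            if w in log_like[c]:  # out-of-vocabulary words are ignored
                score += log_like[c][w]
        scores.append(score)
    # Normalise in log space to avoid underflow on long documents.
    top = max(scores)
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    return [w / total for w in weights]

# Hypothetical toy corpus: (keywords, category) pairs.
TRAINING = [
    ("goal match team score".split(), "sport"),
    ("team win league match".split(), "sport"),
    ("stock market price share".split(), "finance"),
    ("bank market rate price".split(), "finance"),
    ("election vote party policy".split(), "politics"),
    ("party policy tax vote".split(), "politics"),
]

model = train(TRAINING)
vec = posterior_vector("team share price market".split(), model)
for category, p in zip(model[0], vec):
    print(f"{category}: {p:.3f}")
```

A plain naïve Bayes classifier would keep only the largest entry of `vec`. The approach in the abstract instead passes the whole vector, which has one dimension per category, to a vector-space classifier such as an SVM or SOM, so the secondary probabilities (here the non-zero weight on "sport") remain available to the downstream model.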

Bibliographic details

  • Source
    Computer and Information Science | 2009, Issue 4 | 1 page
  • Original format: PDF
  • Language: English
  • Date indexed: 2024-01-25 19:59:20