Text Document Pre-Processing Using the Bayes Formula for Classification Based on the Vector Space Model

Abstract

This work uses the Bayes formula to vectorize a document as a probability distribution, derived from its keywords, over the categories the document may belong to. For a predetermined set of topics (categories), the Bayes formula gives the probability that the document belongs to each one. With this probability distribution as the vector representing the document, text classification algorithms based on the vector space model, such as the Support Vector Machine (SVM) and the Self-Organizing Map (SOM), can then classify the documents on a multi-dimensional level. This improves on the results obtained when only the highest probability is used to classify the document, as happens when the naïve Bayes classifier is applied by itself. The effects of an inadvertent dimensionality reduction can be overcome using these algorithms. We compare the performance of these classifiers for high-dimensional data.
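The vectorization step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a multinomial naïve Bayes model with Laplace smoothing, a toy training set, and hypothetical names (`train`, `posterior_vector`, `TRAIN`). The point it shows is that the full posterior over categories is kept as the document vector instead of only its largest entry.

```python
import math
from collections import Counter

def train(docs):
    """docs: list of (tokens, category). Returns priors, per-category word counts, vocabulary."""
    cat_docs = Counter(cat for _, cat in docs)
    word_counts = {cat: Counter() for cat in cat_docs}
    for tokens, cat in docs:
        word_counts[cat].update(tokens)
    vocab = {w for counts in word_counts.values() for w in counts}
    priors = {cat: n / len(docs) for cat, n in cat_docs.items()}
    return priors, word_counts, vocab

def posterior_vector(tokens, priors, word_counts, vocab):
    """Return (categories, P(category | tokens)) via the Bayes formula.

    Assumption: multinomial likelihood with add-one (Laplace) smoothing;
    words outside the training vocabulary are ignored.
    """
    cats = sorted(priors)
    log_scores = []
    for cat in cats:
        total = sum(word_counts[cat].values())
        score = math.log(priors[cat])
        for w in tokens:
            if w in vocab:
                score += math.log((word_counts[cat][w] + 1) / (total + len(vocab)))
        log_scores.append(score)
    # Normalize in log space so the entries sum to 1 without underflow.
    peak = max(log_scores)
    exps = [math.exp(s - peak) for s in log_scores]
    norm = sum(exps)
    return cats, [e / norm for e in exps]

# Toy training data (hypothetical, for illustration only).
TRAIN = [
    (["goal", "match", "team"], "sports"),
    (["team", "coach", "match"], "sports"),
    (["stock", "market", "bank"], "finance"),
    (["bank", "loan", "market"], "finance"),
    (["election", "vote", "party"], "politics"),
]

model = train(TRAIN)
cats, vec = posterior_vector(["team", "bank", "match"], *model)
# vec has one entry per category; a vector-space classifier such as an
# SVM or SOM would take vec as input instead of the single top category.
```

Taking only `cats[vec.index(max(vec))]` would reproduce the plain naïve Bayes decision; passing the whole `vec` to a downstream classifier preserves the information in the lower-probability categories.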

Bibliographic details

Source: Computer and Information Science, 2009, Issue 4, 1 page
Original format: PDF
Language of text: English
Date added to database: 2024-01-25 19:59:20
