Venue: AAAI Workshop

A Comparison of Event Models for Naive Bayes Text Classification



Abstract

Recent work in text classification has used two different first-order probabilistic models for classification, both of which make the naive Bayes assumption. Some use a multi-variate Bernoulli model, that is, a Bayesian network with no dependencies between words and binary word features (e.g. Larkey and Croft 1996; Koller and Sahami 1997). Others use a multinomial model, that is, a uni-gram language model with integer word counts (e.g. Lewis and Gale 1994; Mitchell 1997). This paper aims to clarify the confusion by describing the differences and details of these two models, and by empirically comparing their classification performance on five text corpora. We find that the multi-variate Bernoulli model performs well with small vocabulary sizes, but that the multinomial model usually performs even better at larger vocabulary sizes, providing on average a 27% reduction in error over the multi-variate Bernoulli model at any vocabulary size.
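The two event models the abstract contrasts differ in how a document is assumed to be generated: the multinomial model draws words with replacement (so integer counts matter), while the multi-variate Bernoulli model flips one coin per vocabulary word (so only presence/absence matters, and absent words also contribute evidence). A minimal from-scratch sketch of both, with Laplace smoothing; the toy corpus, class labels, and function names are illustrative, not from the paper:

```python
import math
from collections import Counter

# Toy training corpus of (class, tokenized document) pairs -- illustrative only.
train = [
    ("sports", ["ball", "goal", "team", "goal"]),
    ("sports", ["team", "win", "ball"]),
    ("tech",   ["code", "bug", "code", "compile"]),
    ("tech",   ["bug", "fix", "code"]),
]
vocab = sorted({w for _, doc in train for w in doc})
classes = sorted({c for c, _ in train})
# Class priors are uniform in this toy set, so they are omitted from the scores.

def train_multinomial(data):
    """Multinomial event model: P(w|c) from total word occurrence counts."""
    counts = {c: Counter() for c in classes}
    for c, doc in data:
        counts[c].update(doc)                      # every occurrence counts
    logp = {}
    for c in classes:
        total = sum(counts[c].values())
        logp[c] = {w: math.log((counts[c][w] + 1) / (total + len(vocab)))
                   for w in vocab}                 # Laplace smoothing
    return logp

def train_bernoulli(data):
    """Multi-variate Bernoulli model: P(w present|c) from document frequencies."""
    docfreq = {c: Counter() for c in classes}
    ndocs = Counter(c for c, _ in data)
    for c, doc in data:
        docfreq[c].update(set(doc))                # at most once per document
    return {c: {w: (docfreq[c][w] + 1) / (ndocs[c] + 2) for w in vocab}
            for c in classes}

def predict_multinomial(logp, doc):
    # Sum log P(w|c) over every token occurrence; unknown words are skipped.
    scores = {c: sum(logp[c][w] for w in doc if w in logp[c]) for c in classes}
    return max(scores, key=scores.get)

def predict_bernoulli(p, doc):
    # Every vocabulary word contributes: log p if present, log (1 - p) if absent.
    present = set(doc)
    scores = {c: sum(math.log(p[c][w]) if w in present else math.log(1 - p[c][w])
                     for w in vocab)
              for c in classes}
    return max(scores, key=scores.get)
```

Note the structural difference the paper examines: `predict_bernoulli` iterates over the whole vocabulary (absent words matter), whereas `predict_multinomial` iterates only over the document's tokens and weights repeated words by their counts.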


