International ACM SIGIR Conference on Research and Development in Information Retrieval

A Visual Tool for Bayesian Data Analysis: The Impact of Smoothing on Naive Bayes Text Classifiers



Abstract

Naive Bayes (NB) classifiers are simple probabilistic classifiers that remain widely used in supervised learning because they combine efficient model training with good empirical results. One drawback of these classifiers is that under data sparsity (i.e., when the training set is small), the maximum likelihood estimate of the probability of an unseen feature is zero, which causes arithmetic anomalies. To prevent this undesirable behavior, a number of smoothing techniques have been proposed [4]. Among these, the Bayesian approach incorporates smoothing through prior knowledge about the model parameters, expressed via what are usually called hyper-parameters. Our research question is: can a visualization tool help researchers quickly assess the performance of NB classifiers by setting optimal smoothing parameters?
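The zero-probability problem described in the abstract can be illustrated with a minimal sketch. This is not the tool or the method from the paper: it assumes a multinomial NB model with additive smoothing, where the pseudo-count `alpha` plays the role of a symmetric Dirichlet hyper-parameter (the posterior-mean estimate). The function names, the `alpha` parameter, and the toy documents are all hypothetical and chosen only for illustration.

```python
import math
from collections import Counter


def train_nb(docs, alpha):
    """Estimate class priors and per-class word probabilities.

    docs  : list of (tokens, label) pairs (toy data, assumed format)
    alpha : additive smoothing pseudo-count; alpha = 0 gives the
            plain maximum likelihood estimate
    """
    vocab = sorted({t for tokens, _ in docs for t in tokens})
    labels = sorted({label for _, label in docs})
    word_counts = {c: Counter() for c in labels}
    doc_counts = Counter()
    for tokens, label in docs:
        word_counts[label].update(tokens)
        doc_counts[label] += 1

    priors = {c: doc_counts[c] / len(docs) for c in labels}
    cond = {}
    for c in labels:
        denom = sum(word_counts[c].values()) + alpha * len(vocab)
        cond[c] = {t: (word_counts[c][t] + alpha) / denom for t in vocab}
    return priors, cond


def log_scores(tokens, priors, cond):
    """Unnormalized log-posterior of each class for one document."""
    scores = {}
    for c in priors:
        score = math.log(priors[c])
        for t in tokens:
            p = cond[c].get(t)
            if p is None:
                continue  # token outside the training vocabulary: skipped
            # math.log(0) raises ValueError, so map zero to -inf explicitly
            score += math.log(p) if p > 0 else float("-inf")
        scores[c] = score
    return scores


# Hypothetical toy corpus: each class never sees some vocabulary words.
docs = [
    (["cheap", "pills", "cheap"], "spam"),
    (["meeting", "agenda"], "ham"),
    (["agenda", "notes"], "ham"),
]
test_doc = ["cheap", "agenda"]

unsmoothed = log_scores(test_doc, *train_nb(docs, alpha=0.0))
smoothed = log_scores(test_doc, *train_nb(docs, alpha=1.0))
print("alpha=0:", unsmoothed)
print("alpha=1:", smoothed)
```

With `alpha=0.0`, each class assigns zero probability to one of the two test words, so both log scores collapse to negative infinity and the classes cannot be compared. With `alpha=1.0`, every score is finite. Varying `alpha` changes the estimates, which is the kind of hyper-parameter effect the abstract proposes to inspect visually.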
