IEEE International Symposium on IT in Medicine Education

Building naive Bayes document classifier using word clusters based on bootstrap averaging


Abstract

Training a naive Bayes document classifier on word clusters can give low classification accuracy because of poor distribution estimation. To address this problem, we build a sequential word list based on the mutual information between words and their semantic cluster labels, construct a sample set of the same size as the word list through bootstrap sampling, and use the average of the corresponding parameters estimated from the sample set as the final parameters for classifying unknown documents. Experimental results on benchmark document data sets show that the proposed strategy achieves higher classification accuracy than naive Bayes document classifiers trained on word clusters or on words.
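The abstract does not give the estimation formulas, so the following Python sketch is only one possible reading of the bootstrap-averaging step, not the authors' implementation. It assumes the word list and the word-to-cluster mapping are already available (the mutual-information ordering is not reproduced), that each bootstrap sample draws words with replacement until it has as many entries as the word list, and that the parameters being averaged are Laplace-smoothed cluster probabilities per class. All function names, the smoothing constant `alpha`, the number of rounds, and the toy data are hypothetical.

```python
import math
import random


def estimate_cluster_probs(word_counts, word_to_cluster, words, n_clusters, alpha=1.0):
    """Laplace-smoothed P(cluster | class) from a (possibly resampled) word list.

    word_counts maps class label -> {word: count}. A word that appears k times
    in `words` contributes its counts k times.
    """
    probs = {}
    for label, counts in word_counts.items():
        totals = [alpha] * n_clusters
        for w in words:
            totals[word_to_cluster[w]] += counts.get(w, 0)
        norm = sum(totals)
        probs[label] = [t / norm for t in totals]
    return probs


def bootstrap_average(word_counts, word_to_cluster, word_list, n_clusters,
                      n_rounds=50, seed=0):
    """Average the per-class cluster probabilities over bootstrap samples."""
    rng = random.Random(seed)
    averaged = {label: [0.0] * n_clusters for label in word_counts}
    for _ in range(n_rounds):
        # Assumption: each sample has the same size as the word list.
        sample = rng.choices(word_list, k=len(word_list))
        estimate = estimate_cluster_probs(word_counts, word_to_cluster, sample, n_clusters)
        for label, probs in estimate.items():
            for j, p in enumerate(probs):
                averaged[label][j] += p / n_rounds
    return averaged


def classify(doc_words, cluster_probs, class_priors, word_to_cluster):
    """Pick the class with the highest naive Bayes log score over word clusters."""
    best_label, best_score = None, -math.inf
    for label, probs in cluster_probs.items():
        score = math.log(class_priors[label])
        for w in doc_words:
            if w in word_to_cluster:  # words outside the word list are ignored
                score += math.log(probs[word_to_cluster[w]])
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Toy data (hypothetical): two classes, two word clusters.
word_counts = {
    "sports": {"goal": 8, "match": 6, "stock": 1},
    "finance": {"stock": 9, "market": 7, "goal": 1},
}
word_to_cluster = {"goal": 0, "match": 0, "stock": 1, "market": 1}
word_list = ["goal", "stock", "match", "market"]
priors = {"sports": 0.5, "finance": 0.5}

averaged = bootstrap_average(word_counts, word_to_cluster, word_list, n_clusters=2)
print(classify(["goal", "match"], averaged, priors, word_to_cluster))
```

Because each bootstrap estimate is a proper distribution over clusters, their average is one as well, so the averaged parameters can be used directly in the usual naive Bayes scoring rule.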
