Journal: 铜仁学院学报 (Journal of Tongren University)

Performance Analysis of Text Classification Algorithms Based on Weka

     

Abstract

To address the problem of choosing a text classification algorithm, we ran simulation experiments on the 20 Newsgroups data set using the open-source data mining software Weka. Based on the experimental results, we evaluated the performance of the Naive Bayes, IB1, and ZeroR algorithms. Of the three, Naive Bayes achieved the highest accuracy and ZeroR ran the fastest. The study shows that the efficiency of text classification depends heavily on the algorithm chosen, and that an appropriate algorithm can significantly improve classification accuracy.
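To make the three classifiers concrete, the following is a minimal pure-Python sketch of the ideas behind ZeroR, IB1, and Naive Bayes. It is not the paper's Weka setup: the toy corpus, the function names, the unnormalised Euclidean distance on raw word counts, and the multinomial Naive Bayes variant with add-one smoothing are all assumptions made for illustration, and the numbers it produces say nothing about the reported 20 Newsgroups results.

```python
import math
from collections import Counter

# Hypothetical toy corpus standing in for 20 Newsgroups (illustration only).
TRAIN = [
    ("the team won the hockey game", "sport"),
    ("a great goal in the final game", "sport"),
    ("the player scored in the match", "sport"),
    ("new graphics card and fast cpu", "tech"),
    ("install the driver for the card", "tech"),
]
TEST = [
    ("the team scored a goal", "sport"),
    ("fast cpu and new driver", "tech"),
]


def tokens(text):
    """Lowercase whitespace tokenisation (a simplifying assumption)."""
    return text.lower().split()


def zero_r(train):
    """ZeroR: ignore the text and always predict the majority class."""
    majority = Counter(label for _, label in train).most_common(1)[0][0]
    return lambda text: majority


def one_nn(train):
    """IB1-style 1-nearest-neighbour on raw word-count vectors."""
    vectors = [(Counter(tokens(t)), label) for t, label in train]

    def dist(a, b):
        # Euclidean distance; a Counter returns 0 for missing words.
        return math.sqrt(sum((a[w] - b[w]) ** 2 for w in set(a) | set(b)))

    def predict(text):
        query = Counter(tokens(text))
        return min(vectors, key=lambda v: dist(query, v[0]))[1]

    return predict


def naive_bayes(train):
    """Multinomial Naive Bayes with Laplace (add-one) smoothing."""
    class_docs = Counter(label for _, label in train)
    word_counts = {c: Counter() for c in class_docs}
    for text, label in train:
        word_counts[label].update(tokens(text))
    vocab = {w for counts in word_counts.values() for w in counts}

    def predict(text):
        def log_posterior(c):
            total = sum(word_counts[c].values()) + len(vocab)
            score = math.log(class_docs[c] / len(train))  # log prior
            for w in tokens(text):
                if w in vocab:  # skip words never seen in training
                    score += math.log((word_counts[c][w] + 1) / total)
            return score

        return max(class_docs, key=log_posterior)

    return predict


def accuracy(predict, test):
    """Fraction of test documents whose predicted label is correct."""
    return sum(predict(text) == label for text, label in test) / len(test)


if __name__ == "__main__":
    for name, build in [("ZeroR", zero_r), ("1-NN", one_nn), ("NaiveBayes", naive_bayes)]:
        print(name, accuracy(build(TRAIN), TEST))
```

The sketch also hints at the speed difference noted in the abstract: ZeroR does a single class count at training time and no per-document work at prediction time, whereas the nearest-neighbour classifier compares each query against every training document.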
