Journal: 《计算机应用》 (Journal of Computer Applications)

Multi-level Classification of Web Emergency News Based on Rules and Statistics

     

Abstract

Web news is growing exponentially and spreads quickly, and news about emergencies is scattered widely across the Internet. Traditional text classification methods suffer from low accuracy and efficiency, which makes it hard to locate emergency news on a specific topic. To address these problems, this paper proposes a multi-level automatic classification method for Web emergency news that combines rules and statistics. First, category keywords are extracted to build a rule base. The classification rules then assign each emergency to one of four major categories, and a naive Bayes classifier subdivides the news within each major category, yielding a two-layer classification model based on rules and statistics. Experimental results show that both the accuracy and the recall of the method exceed 90%, and that its classification efficiency is generally higher than that of traditional classification methods.
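The two-layer idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the category names, the keywords in `RULES`, the toy training data in `TRAINING`, and every function and class name below are hypothetical, since the abstract does not list the four major categories or the rule base. Layer 1 counts keyword hits per major category, and layer 2 applies a small multinomial naive Bayes classifier (with Laplace smoothing, an assumed detail) trained separately for each major category.

```python
import math
from collections import Counter, defaultdict

# Hypothetical rule base: major category -> keywords (not taken from the paper).
RULES = {
    "natural_disaster": {"earthquake", "flood", "typhoon"},
    "accident": {"explosion", "crash", "collapse"},
    "public_health": {"outbreak", "virus", "poisoning"},
    "social_security": {"riot", "attack", "hostage"},
}


def classify_major(tokens):
    """Layer 1: return the major category with the most keyword hits, or None."""
    scores = {cat: sum(t in kws for t in tokens) for cat, kws in RULES.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None


class NaiveBayes:
    """Multinomial naive Bayes with Laplace (add-one) smoothing."""

    def __init__(self):
        self.class_docs = Counter()              # documents seen per label
        self.word_counts = defaultdict(Counter)  # label -> token counts
        self.vocab = set()

    def fit(self, docs, labels):
        for tokens, label in zip(docs, labels):
            self.class_docs[label] += 1
            self.word_counts[label].update(tokens)
            self.vocab.update(tokens)
        return self

    def predict(self, tokens):
        total_docs = sum(self.class_docs.values())
        vocab_size = len(self.vocab)

        def log_posterior(label):
            counts = self.word_counts[label]
            n_tokens = sum(counts.values())
            score = math.log(self.class_docs[label] / total_docs)
            for t in tokens:
                score += math.log((counts[t] + 1) / (n_tokens + vocab_size))
            return score

        return max(self.class_docs, key=log_posterior)


# Hypothetical toy training data: major category -> (tokens, subcategory) pairs.
TRAINING = {
    "natural_disaster": [
        (["earthquake", "magnitude", "aftershock"], "earthquake"),
        (["flood", "river", "rainfall"], "flood"),
    ],
    "accident": [
        (["crash", "highway", "truck"], "traffic"),
        (["explosion", "mine", "gas"], "mining"),
    ],
}


def train_second_layer(training):
    """Layer 2: fit one naive Bayes model per major category."""
    models = {}
    for major, examples in training.items():
        docs = [tokens for tokens, _ in examples]
        labels = [label for _, label in examples]
        models[major] = NaiveBayes().fit(docs, labels)
    return models


def classify(tokens, models):
    """Apply the rules first, then the matching naive Bayes model if one exists."""
    major = classify_major(tokens)
    if major is None or major not in models:
        return major, None
    return major, models[major].predict(tokens)


MODELS = train_second_layer(TRAINING)
```

Calling `classify(["flood", "rainfall", "village"], MODELS)` routes the tokens through the rule layer and then through the naive Bayes model for the matched major category. Restricting each statistical model to one major category keeps its label set small, which is one plausible reason a two-layer design could be faster than a single flat classifier.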
