Conference: Information and Communications Technology, 2005. Enabling Technologies for the New Knowledge Society: ITI 3rd International Conference on
Research and Realization of Naive Bayes English Text Classification Method Based on Base Noun Phrase Identification

Abstract

To further improve the classification accuracy of English texts, a Naïve Bayes method based on base noun phrase (BaseNP) identification is presented. The emerging maximum entropy model is applied to the identification task. First, candidate features are generated from a training corpus and user-defined feature templates. Second, a feature selection algorithm that computes feature gains is applied to select features. Finally, at the parameter estimation stage, the improved iterative scaling (IIS) algorithm is adopted. Experimental results show that this technique achieves precision and recall of roughly 93% for BaseNP identification, and that classification accuracy improves markedly on this basis. This indicates that high-accuracy shallow parsing is very helpful to text classification.
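
The idea of feeding base noun phrases into a Naïve Bayes classifier can be illustrated with a minimal Python sketch. It is not the authors' implementation: the paper identifies BaseNPs with a maximum entropy model trained by IIS, whereas this sketch assumes pre-tagged input and substitutes a toy part-of-speech rule (a run of adjectives and nouns ending in a noun). The names `base_noun_phrases`, `features`, and `NaiveBayes`, the tag sets, and the two training sentences are all hypothetical.

```python
import math
from collections import Counter, defaultdict

# Assumption: Penn Treebank style tags; only a tiny subset is handled here.
NOUN_TAGS = {"NN", "NNS"}
MODIFIER_TAGS = {"JJ"}

def base_noun_phrases(tagged):
    """Toy stand-in for the paper's maximum entropy chunker.

    Returns multi-word phrases (joined with '_') made of adjectives and
    nouns that end in a noun. Determiners only mark a phrase boundary.
    """
    phrases, run = [], []

    def flush():
        # Trim trailing modifiers so the phrase ends in a noun head.
        while run and run[-1][1] not in NOUN_TAGS:
            run.pop()
        if len(run) >= 2:
            phrases.append("_".join(word.lower() for word, _ in run))
        run.clear()

    for word, tag in tagged:
        if tag in NOUN_TAGS or tag in MODIFIER_TAGS:
            run.append((word, tag))
        else:
            flush()  # any other tag (including DT) closes the current run
    flush()
    return phrases

def features(tagged):
    """Bag of lowercased words plus the extracted base noun phrases."""
    return [word.lower() for word, _ in tagged] + base_noun_phrases(tagged)

class NaiveBayes:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def __init__(self):
        self.class_docs = Counter()
        self.feat_counts = defaultdict(Counter)
        self.vocab = set()

    def fit(self, docs, labels):
        for feats, label in zip(docs, labels):
            self.class_docs[label] += 1
            self.feat_counts[label].update(feats)
            self.vocab.update(feats)
        return self

    def predict(self, feats):
        total_docs = sum(self.class_docs.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_docs.items():
            counts = self.feat_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(n_docs / total_docs)
            for feat in feats:
                if feat in self.vocab:  # features unseen in training are skipped
                    score += math.log((counts[feat] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label

# Hypothetical, hand-tagged toy data (not from the paper).
train = [
    ([("The", "DT"), ("football", "NN"), ("team", "NN"),
      ("won", "VBD"), ("the", "DT"), ("match", "NN")], "sports"),
    ([("The", "DT"), ("stock", "NN"), ("market", "NN"),
      ("fell", "VBD"), ("sharply", "RB")], "finance"),
]
model = NaiveBayes().fit([features(doc) for doc, _ in train],
                         [label for _, label in train])

query = [("The", "DT"), ("stock", "NN"), ("market", "NN"), ("rose", "VBD")]
predicted = model.predict(features(query))
```

In this sketch the phrase feature `stock_market` is counted alongside the single words `stock` and `market`, so a matching phrase adds evidence beyond what the individual words contribute. How the paper weights or combines phrase and word features is not stated in the abstract.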
