Conference: International Conference on Computer Science and Information Processing (CSIP 2012)

Internet news headlines classification method based on the N-Gram language model

Abstract

This paper addresses the classification of Internet news headlines, a short-text classification task. After analyzing traditional classification models and the characteristics of Internet news headlines, the paper proposes using the N-Gram language model as the classification model for Internet news headlines. The classification process is divided into three modules: a preprocessing module, a training module, and a prediction module. A classification algorithm for Internet news headlines is designed on the basis of the N-Gram language model. The algorithm classifies a headline by calculating the probability of the unclassified word string with respect to each category C; in calculating this probability, it also takes into account the dependence of each word on the preceding term. According to the authors, this gives the method better classification performance.
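
The three modules described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a bigram model (N = 2), whitespace tokenization, add-one smoothing, and a category prior estimated from training counts, none of which the abstract specifies. The names `tokenize`, `BigramHeadlineClassifier`, and the `<s>` start marker are hypothetical.

```python
import math
from collections import Counter, defaultdict

BOS = "<s>"  # hypothetical start-of-headline marker (assumption)


def tokenize(headline):
    """Preprocessing module: lowercase and split on whitespace (assumed scheme)."""
    return headline.lower().split()


class BigramHeadlineClassifier:
    """Per-category bigram language model with add-one smoothing (assumed)."""

    def __init__(self):
        self.bigrams = defaultdict(Counter)   # category -> counts of (prev, word)
        self.contexts = defaultdict(Counter)  # category -> counts of prev
        self.doc_counts = Counter()           # category -> number of headlines
        self.vocab = set()

    def train(self, labelled_headlines):
        """Training module: collect bigram counts for each category."""
        for headline, category in labelled_headlines:
            tokens = [BOS] + tokenize(headline)
            self.doc_counts[category] += 1
            self.vocab.update(tokens)
            for prev, word in zip(tokens, tokens[1:]):
                self.bigrams[category][(prev, word)] += 1
                self.contexts[category][prev] += 1

    def log_score(self, headline, category):
        """log P(C) plus the sum of log P(word | previous word, C)."""
        tokens = [BOS] + tokenize(headline)
        vocab_size = len(self.vocab)
        total = sum(self.doc_counts.values())
        score = math.log(self.doc_counts[category] / total)
        for prev, word in zip(tokens, tokens[1:]):
            numerator = self.bigrams[category][(prev, word)] + 1
            denominator = self.contexts[category][prev] + vocab_size
            score += math.log(numerator / denominator)
        return score

    def predict(self, headline):
        """Prediction module: return the category with the highest score."""
        return max(self.doc_counts, key=lambda c: self.log_score(headline, c))


# Toy usage with made-up headlines (illustrative data, not from the paper).
clf = BigramHeadlineClassifier()
clf.train([
    ("team wins cup final", "sports"),
    ("striker scores late goal", "sports"),
    ("stock market falls sharply", "finance"),
    ("bank raises interest rates", "finance"),
])
print(clf.predict("team wins final"))
```

Conditioning each word on the one before it is what distinguishes this from a unigram (bag-of-words) model: the pair "team wins" contributes evidence for a category even when neither word alone is decisive. Working in log space avoids numeric underflow when multiplying many small probabilities.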