Conference: Chinese Automation Congress

Short Text Classification Based on Keywords Extension

Abstract

Because short news texts contain few words, traditional text processing methods often lose semantic information when analyzing them, and this is one of the bottlenecks that limit the performance of short text classification. This paper trains a Word2Vec model on an external corpus, uses the resulting external semantic information to extend the keywords extracted by a traditional keyword extraction algorithm, and studies the feasibility of this keyword extension under different extension methods. Finally, the KNN (K-Nearest Neighbor) algorithm is used to verify that the proposed method improves short news text classification compared with the classical algorithm and achieves performance close to that of current mainstream text classification algorithms.
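The pipeline described in the abstract (extend extracted keywords with semantically similar words, then classify with KNN) can be illustrated with a minimal sketch. Everything specific here is an assumption for illustration, not the paper's implementation: the tiny EMBEDDINGS table stands in for a Word2Vec model trained on an external corpus, keywords are assumed to be already extracted, and the names expand_keywords, text_vector, knn_classify, top_n and min_sim are hypothetical. The extension rule shown (add the nearest neighbours above a similarity threshold) is only one possible extension method.

```python
import math
from collections import Counter

# Toy 2-D word vectors standing in for a trained Word2Vec model
# (hypothetical values chosen only for illustration).
EMBEDDINGS = {
    "match": (0.9, 0.1),
    "goal": (0.8, 0.2),
    "team": (0.85, 0.15),
    "stock": (0.1, 0.9),
    "market": (0.2, 0.8),
    "shares": (0.15, 0.85),
}


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def expand_keywords(keywords, top_n=1, min_sim=0.9):
    """Append up to top_n similar vocabulary words for each known keyword."""
    expanded = list(keywords)
    for kw in keywords:
        if kw not in EMBEDDINGS:
            continue  # out-of-vocabulary keywords are kept but not extended
        scored = [
            (cosine(EMBEDDINGS[kw], vec), word)
            for word, vec in EMBEDDINGS.items()
            if word not in expanded
        ]
        # Keep only neighbours that are similar enough, best first.
        scored = sorted((s, w) for s, w in scored if s >= min_sim)[::-1]
        expanded.extend(word for _, word in scored[:top_n])
    return expanded


def text_vector(keywords):
    """Average the vectors of all in-vocabulary keywords."""
    vecs = [EMBEDDINGS[w] for w in keywords if w in EMBEDDINGS]
    if not vecs:
        raise ValueError("no keyword has a known embedding")
    return tuple(sum(component) / len(vecs) for component in zip(*vecs))


def knn_classify(query_keywords, training, k=3):
    """Majority vote among the k training texts most similar to the query."""
    query = text_vector(expand_keywords(query_keywords))
    ranked = sorted(
        training,
        key=lambda item: cosine(query, text_vector(expand_keywords(item[0]))),
        reverse=True,
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical labelled short texts, already reduced to keywords.
TRAIN = [
    (["match", "goal"], "sports"),
    (["team"], "sports"),
    (["stock"], "finance"),
    (["market", "shares"], "finance"),
]

print(expand_keywords(["goal"]))
print(knn_classify(["goal"], TRAIN))
```

In practice the toy table would be replaced by vectors from a real Word2Vec model, and the keyword lists by the output of a keyword extraction step; the similarity threshold is a design choice in this sketch that keeps unrelated words out of the extended keyword set when the vocabulary is small.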
