International Conference on Natural Language Processing and Chinese Computing

A News Headlines Classification Method Based on the Fusion of Related Words

Abstract

Short text classification is challenging because each text contains only a few words, usually fewer than 20, which leads to feature sparsity. In this paper, we propose a method of extending short texts to cope with this data sparsity. We combine the extension of a short text with the word vector of each word in the text, trained with the word2vec model on a large-scale corpus, to form a new representation. This new representation serves as input to a neural bag-of-words (NBOW) model. We evaluate the method on NLPCC 2017 Evaluation Task 2. The experimental results show that short text extension combined with the NBOW model outperforms the baselines and performs well on the news headline classification task.
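The abstract does not give implementation details, so the following is only a minimal sketch of the general idea: extend a headline with related words, average the word vectors (the NBOW representation), and pass the result to a softmax layer. Everything concrete here is an assumption made for illustration: the toy 3-dimensional vectors stand in for word2vec embeddings, "related" is taken to mean nearest by cosine similarity, and the names `EMBEDDINGS`, `related_words`, `extend_headline`, `nbow_vector`, and `classify` are hypothetical rather than taken from the paper.

```python
import math

# Toy 3-dimensional vectors, invented for illustration. A real setup would
# load word2vec vectors trained on a large-scale corpus.
EMBEDDINGS = {
    "stocks":   [0.9, 0.1, 0.0],
    "market":   [0.8, 0.2, 0.0],
    "shares":   [0.85, 0.15, 0.05],
    "rally":    [0.6, 0.4, 0.0],
    "match":    [0.0, 0.9, 0.1],
    "football": [0.1, 0.95, 0.0],
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def related_words(word, k=1):
    """Return the k vocabulary words closest to `word` (assumed criterion)."""
    if word not in EMBEDDINGS:
        return []
    scored = [(cosine(EMBEDDINGS[word], vec), other)
              for other, vec in EMBEDDINGS.items() if other != word]
    scored.sort(reverse=True)
    return [other for _, other in scored[:k]]


def extend_headline(tokens, k=1):
    """Append up to k related words per token, skipping duplicates."""
    extended = list(tokens)
    for tok in tokens:
        for rel in related_words(tok, k):
            if rel not in extended:
                extended.append(rel)
    return extended


def nbow_vector(tokens):
    """Average the vectors of all known tokens (neural bag-of-words input)."""
    vectors = [EMBEDDINGS[t] for t in tokens if t in EMBEDDINGS]
    dim = len(next(iter(EMBEDDINGS.values())))
    if not vectors:
        return [0.0] * dim
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def classify(tokens, weights, biases):
    """Extend the headline, average its vectors, apply one linear layer."""
    vec = nbow_vector(extend_headline(tokens))
    scores = [sum(w * x for w, x in zip(row, vec)) + b
              for row, b in zip(weights, biases)]
    return softmax(scores)


if __name__ == "__main__":
    # Hand-set weights for two hypothetical classes (finance, sports);
    # in practice these would be learned from labeled headlines.
    weights = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
    biases = [0.0, 0.0]
    print(extend_headline(["stocks", "rally"]))
    print(classify(["stocks", "rally"], weights, biases))
```

In this sketch the extension step gives a two-word headline more vectors to average, which is one plausible way to reduce the sparsity the abstract describes. How the paper actually selects related words and fuses them with the original word vectors may differ.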
