Conference: IEEE International Conference on Computer and Communications

Word-Level and Character-Level Mixed Features for Chinese Short Text Classification

Abstract

This paper proposes a novel method to address the insufficient representational power of character-level or word-level features when either is used alone. Because short texts are brief, sparse, and strongly context-dependent, the method takes word-level vectors and character-level vectors as simultaneous inputs and encodes sentence semantics with two Long Short-Term Memory networks (LSTMs) or bidirectional LSTMs (BiLSTMs). The representation of the entire sentence combines the two outputs, one from the word-level vectors and one from the character-level vectors. Experiments on Chinese short text classification show that word embeddings and character embeddings complement each other in the sentence's semantic representation, which helps improve classification performance.
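The two-encoder idea described in the abstract can be sketched roughly as follows. This is a minimal illustration in pure Python, not the authors' implementation: the `TinyLSTM` class, the `embed` and `mixed_representation` helpers, the random untrained weights, the embedding and hidden sizes, and the choice of concatenation as the way to combine the two outputs are all assumptions, since the abstract does not specify them. The example sentence is segmented into words by hand.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


class TinyLSTM:
    """Single-layer LSTM with random, untrained weights (illustrative only)."""

    def __init__(self, input_dim, hidden_dim, rng):
        self.hidden_dim = hidden_dim
        n = input_dim + hidden_dim
        # One weight matrix and bias per gate: input, forget, output, candidate.
        self.W = {g: [[rng.uniform(-0.1, 0.1) for _ in range(n)]
                      for _ in range(hidden_dim)] for g in "ifoc"}
        self.b = {g: [0.0] * hidden_dim for g in "ifoc"}

    def _gate(self, g, z, act):
        return [act(sum(w * v for w, v in zip(row, z)) + b)
                for row, b in zip(self.W[g], self.b[g])]

    def encode(self, vectors):
        """Run over the sequence and return the final hidden state."""
        h = [0.0] * self.hidden_dim
        c = [0.0] * self.hidden_dim
        for x in vectors:
            z = list(x) + h  # current input joined with previous hidden state
            i = self._gate("i", z, sigmoid)
            f = self._gate("f", z, sigmoid)
            o = self._gate("o", z, sigmoid)
            g = self._gate("c", z, math.tanh)
            c = [fj * cj + ij * gj for fj, cj, ij, gj in zip(f, c, i, g)]
            h = [oj * math.tanh(cj) for oj, cj in zip(o, c)]
        return h


def embed(tokens, table, dim, rng):
    """Look up (or lazily create) a random embedding for each token."""
    out = []
    for t in tokens:
        if t not in table:
            table[t] = [rng.uniform(-0.5, 0.5) for _ in range(dim)]
        out.append(table[t])
    return out


def mixed_representation(words, word_lstm, char_lstm,
                         word_table, char_table, emb_dim, rng):
    """Encode the word and character sequences separately, then combine.

    Assumption: the two final hidden states are concatenated.
    """
    chars = [ch for w in words for ch in w]
    h_word = word_lstm.encode(embed(words, word_table, emb_dim, rng))
    h_char = char_lstm.encode(embed(chars, char_table, emb_dim, rng))
    return h_word + h_char  # list concatenation


rng = random.Random(0)
EMB, HID = 8, 4  # hypothetical sizes
word_lstm = TinyLSTM(EMB, HID, rng)
char_lstm = TinyLSTM(EMB, HID, rng)

words = ["中文", "短文本", "分类"]  # hand-segmented example sentence
rep = mixed_representation(words, word_lstm, char_lstm, {}, {}, EMB, rng)
print(len(rep))  # 2 * HID
```

In a real system the combined vector would feed a trained classification layer, and a bidirectional variant would add a second pass over each reversed sequence; both are omitted here to keep the sketch short.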
