IEEE International Conference on Acoustics, Speech and Signal Processing

Recurrent neural network language model with structured word embeddings for speech recognition



Abstract

Due to its effective word-context encoding and long-term context preservation, the recurrent neural network language model (RNNLM) has attracted great interest by outperforming back-off n-gram models and feed-forward neural network language models (FNNLM). However, it still has difficulty modelling words that occur with very low frequency in the training data. To address this issue, a new framework of structured word embeddings is introduced to the RNNLM, in which both the input and target word embeddings are factorized into weighted sums of the corresponding sub-word embeddings. The framework is instantiated for Chinese, where characters can naturally be used as the sub-word units. Experiments on a Chinese Twitter LVCSR task showed that the proposed approach effectively outperformed the standard RNNLM, yielding a relative PPL improvement of 8.8% and an absolute 0.59% CER improvement in N-best re-scoring.
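The core factorization described in the abstract can be sketched in a few lines: a word's embedding is composed as a weighted sum of the embeddings of its sub-word units (characters, for Chinese). The sketch below is illustrative only; the embedding dimension, the uniform weights, and all names are assumptions, not the paper's actual parameterization (where the weights would typically be learned).

```python
import numpy as np

# Hypothetical sketch of structured word embeddings: the embedding of a word
# is factorized into a weighted sum of its character (sub-word) embeddings,
# so rare words share parameters with the characters they contain.
rng = np.random.default_rng(0)

EMB_DIM = 8  # illustrative dimension
char_vocab = ["你", "好", "世", "界"]
char_emb = {c: rng.standard_normal(EMB_DIM) for c in char_vocab}

def word_embedding(word, weights=None):
    """Compose a word embedding as a weighted sum of character embeddings.

    `weights` defaults to a uniform average; in the paper's framework the
    weights would be learned jointly with the model.
    """
    chars = list(word)
    if weights is None:
        weights = np.full(len(chars), 1.0 / len(chars))
    return sum(w * char_emb[c] for w, c in zip(weights, chars))

v = word_embedding("你好")  # average of the embeddings of 你 and 好
```

Because every word, however rare, decomposes into characters drawn from a small shared inventory, its embedding is never trained from scratch, which is what mitigates the low-frequency-word problem the abstract describes.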


