Long short-term memory language models with additive morphological features for automatic speech recognition

IEEE International Conference on Acoustics, Speech and Signal Processing


Abstract

Models of morphologically rich languages suffer from data sparsity when words are treated as atomic units. Word-based language models cannot transfer knowledge from common word forms to rarer variant forms. Learning a continuous vector representation of each morpheme allows a compositional model to represent a word as the sum of its constituent morphemes' vectors. Rare and unknown words containing common morphemes can thus be represented with greater fidelity despite their sparsity. Our novel neural network language model integrates this additive morphological representation into a long short-term memory architecture, improving Russian speech recognition word error rates by 0.9 absolute, 4.4% relative, compared to a robust n-gram baseline model.
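The core idea described in the abstract is straightforward to sketch: each word's input vector is the sum of the embeddings of its constituent morphemes, and that composed vector feeds an otherwise standard LSTM language model. The snippet below is a minimal illustration of this additive composition, not the authors' implementation; the class name AdditiveMorphLM, the tensor layout, and all hyperparameters are hypothetical, and PyTorch is assumed.

```python
import torch
import torch.nn as nn


class AdditiveMorphLM(nn.Module):
    """LSTM language model whose word vectors are sums of morpheme vectors."""

    def __init__(self, num_morphemes, embed_dim, hidden_dim, vocab_size, pad_id=0):
        super().__init__()
        # One embedding per morpheme; the padding index contributes a zero vector,
        # so words with fewer morphemes than the maximum are handled naturally.
        self.morph_embed = nn.Embedding(num_morphemes, embed_dim, padding_idx=pad_id)
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        self.out = nn.Linear(hidden_dim, vocab_size)

    def forward(self, morpheme_ids):
        # morpheme_ids: (batch, seq_len, max_morphemes_per_word)
        # Additive composition: sum the morpheme vectors to get each word vector.
        word_vecs = self.morph_embed(morpheme_ids).sum(dim=2)  # (batch, seq_len, embed_dim)
        hidden, _ = self.lstm(word_vecs)                       # (batch, seq_len, hidden_dim)
        return self.out(hidden)                                # logits over the next word


# Toy usage: 2 sentences, 5 words each, at most 3 morphemes per word.
model = AdditiveMorphLM(num_morphemes=1000, embed_dim=64, hidden_dim=128, vocab_size=5000)
logits = model(torch.randint(1, 1000, (2, 5, 3)))
print(logits.shape)  # torch.Size([2, 5, 5000])
```

Because morpheme vectors are shared across word forms, a rare inflection of a common stem still receives an informative input vector, which is the effect the abstract credits for the word error rate reduction over the n-gram baseline.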
