IEEE International Conference on Acoustics, Speech and Signal Processing

Listen, attend and spell: A neural network for large vocabulary conversational speech recognition

Abstract

We present Listen, Attend and Spell (LAS), a neural speech recognizer that transcribes speech utterances directly to characters without pronunciation models, HMMs or other components of traditional speech recognizers. In LAS, the neural network architecture subsumes the acoustic, pronunciation and language models, making it not only an end-to-end trained system but an end-to-end model. In contrast to DNN-HMM, CTC and most other models, LAS makes no independence assumptions about the probability distribution of the output character sequences given the acoustic sequence. Our system has two components: a listener and a speller. The listener is a pyramidal recurrent network encoder that accepts filter bank spectra as inputs. The speller is an attention-based recurrent network decoder that emits each character conditioned on all previous characters and the entire acoustic sequence. On a Google voice search task, LAS achieves a WER of 14.1% without a dictionary or an external language model, and 10.3% with language model rescoring over the top 32 beams. In comparison, the state-of-the-art CLDNN-HMM model achieves a WER of 8.0% on the same set.
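
The listener/speller structure described in the abstract can be sketched in a few dozen lines. The following is a minimal PyTorch sketch of a pyramidal recurrent encoder and an attention-based character decoder; the layer sizes, the use of LSTM cells, and the dot-product attention scoring are illustrative assumptions, not necessarily the paper's exact configuration.

```python
# Minimal sketch of a Listen, Attend and Spell style model.
# Dimensions, LSTM cells and dot-product attention are illustrative assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F


class Listener(nn.Module):
    """Pyramidal bidirectional RNN encoder over filter bank features."""

    def __init__(self, input_dim=40, hidden_dim=256, num_pyramid_layers=3):
        super().__init__()
        self.base = nn.LSTM(input_dim, hidden_dim, bidirectional=True, batch_first=True)
        # Each pyramidal layer concatenates pairs of consecutive frames,
        # halving the time resolution before the next bidirectional LSTM.
        self.pyramid = nn.ModuleList(
            [nn.LSTM(4 * hidden_dim, hidden_dim, bidirectional=True, batch_first=True)
             for _ in range(num_pyramid_layers)]
        )

    def forward(self, x):                      # x: (batch, time, input_dim)
        h, _ = self.base(x)
        for layer in self.pyramid:
            if h.size(1) % 2 == 1:             # pad to an even number of frames
                h = F.pad(h, (0, 0, 0, 1))
            b, t, d = h.shape
            h = h.reshape(b, t // 2, 2 * d)    # concatenate adjacent frames
            h, _ = layer(h)
        return h                               # (batch, time / 2**L, 2 * hidden_dim)


class Speller(nn.Module):
    """Attention-based RNN decoder that emits one character at a time."""

    def __init__(self, vocab_size, enc_dim=512, hidden_dim=512):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, hidden_dim)
        self.rnn = nn.LSTMCell(hidden_dim + enc_dim, hidden_dim)
        self.query = nn.Linear(hidden_dim, enc_dim)
        self.out = nn.Linear(hidden_dim + enc_dim, vocab_size)

    def forward(self, enc, targets):           # enc: (B, T, enc_dim); targets: (B, U)
        B = enc.size(0)
        h = enc.new_zeros(B, self.rnn.hidden_size)
        c = enc.new_zeros(B, self.rnn.hidden_size)
        context = enc.new_zeros(B, enc.size(2))
        logits = []
        for u in range(targets.size(1)):        # teacher forcing on previous characters
            emb = self.embed(targets[:, u])
            h, c = self.rnn(torch.cat([emb, context], dim=-1), (h, c))
            # Content-based attention over every encoder time step.
            scores = torch.bmm(enc, self.query(h).unsqueeze(-1)).squeeze(-1)
            context = torch.bmm(F.softmax(scores, dim=-1).unsqueeze(1), enc).squeeze(1)
            logits.append(self.out(torch.cat([h, context], dim=-1)))
        return torch.stack(logits, dim=1)       # (B, U, vocab_size)


# Toy forward pass: 40-dim filter bank frames in, character logits out.
listener, speller = Listener(), Speller(vocab_size=30)
feats = torch.randn(2, 200, 40)                 # (batch, frames, mel bins)
chars = torch.randint(0, 30, (2, 12))           # previous characters (teacher forcing)
print(speller(listener(feats), chars).shape)    # torch.Size([2, 12, 30])
```

The pyramidal structure is the key design choice here: each pyramid layer halves the number of time steps, so the speller's attention operates over a much shorter encoded sequence than the raw frame sequence.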
