Venue: IEEE Automatic Speech Recognition and Understanding Workshop

Character-Aware Attention-Based End-to-End Speech Recognition

Abstract

Predicting words and subword units (WSUs) as the output has been shown to be effective for the attention-based encoder-decoder (AED) model in end-to-end speech recognition. However, as one input to the decoder recurrent neural network (RNN), each WSU embedding is learned independently from context and acoustic information in a purely data-driven fashion. Little effort has been made to explicitly model the morphological relationships among WSUs. In this work, we propose a novel character-aware (CA) AED model in which each WSU embedding is computed by summarizing the embeddings of its constituent characters using a CA-RNN. This WSU-independent CA-RNN is jointly trained with the encoder, the decoder and the attention network of a conventional AED to predict WSUs. With CA-AED, the embeddings of morphologically similar WSUs are naturally and directly correlated through the CA-RNN, in addition to the semantic and acoustic relations modeled by a traditional AED. Moreover, CA-AED significantly reduces the number of model parameters relative to a traditional AED by replacing the large pool of WSU embeddings with a much smaller set of character embeddings. On a 3400-hour Microsoft Cortana dataset, CA-AED achieves up to 11.9% relative WER improvement over a strong AED baseline with 27.1% fewer model parameters.
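
The idea of computing a WSU embedding from its characters can be illustrated with a minimal sketch. The abstract does not specify the CA-RNN cell type, the dimensions, or the character set, so everything below is an assumption made for illustration: a plain tanh recurrent cell, toy sizes (`CHAR_DIM`, `WSU_DIM`), a lowercase English character inventory, random untrained weights, and the hypothetical helper names `init_params`, `wsu_embedding`, `count_ca_params` and `count_table_params`. It is not the authors' implementation.

```python
import math
import random

# All sizes and the character set are hypothetical, chosen only for illustration.
CHARS = "abcdefghijklmnopqrstuvwxyz'"
CHAR_DIM = 4   # size of each character embedding
WSU_DIM = 6    # size of the resulting WSU embedding (RNN hidden size)


def init_params(seed=0):
    """Random, untrained parameters shared by every WSU."""
    rng = random.Random(seed)

    def matrix(rows, cols):
        return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

    return {
        "char_emb": {c: [rng.uniform(-0.5, 0.5) for _ in range(CHAR_DIM)] for c in CHARS},
        "W_in": matrix(WSU_DIM, CHAR_DIM),   # character embedding -> hidden
        "W_rec": matrix(WSU_DIM, WSU_DIM),   # hidden -> hidden
        "bias": [0.0] * WSU_DIM,
    }


def wsu_embedding(wsu, params):
    """Summarize the characters of one WSU with a tanh RNN.

    The final hidden state is used as the WSU embedding (an assumption;
    the abstract only says the character embeddings are "summarized").
    """
    h = [0.0] * WSU_DIM
    for ch in wsu:
        x = params["char_emb"][ch]
        h = [
            math.tanh(
                sum(w * xi for w, xi in zip(params["W_in"][i], x))
                + sum(w * hj for w, hj in zip(params["W_rec"][i], h))
                + params["bias"][i]
            )
            for i in range(WSU_DIM)
        ]
    return h


def count_ca_params(params):
    """Parameters needed by the character-based embedding path."""
    return (
        len(params["char_emb"]) * CHAR_DIM
        + WSU_DIM * CHAR_DIM
        + WSU_DIM * WSU_DIM
        + WSU_DIM
    )


def count_table_params(vocab_size):
    """Parameters of a conventional lookup table with one row per WSU."""
    return vocab_size * WSU_DIM


if __name__ == "__main__":
    params = init_params()
    # "play" and "plays" share a character prefix, so the RNN processes
    # them identically up to the final "s".
    print(wsu_embedding("play", params))
    print(wsu_embedding("plays", params))
    print(count_ca_params(params), count_table_params(1000))
```

In this toy setting the character path has a fixed parameter count regardless of how many WSUs the model predicts, whereas a lookup table grows linearly with the WSU vocabulary. That is the mechanism behind the parameter reduction the abstract reports; the specific figures there (27.1% fewer parameters) come from the paper's own configuration, not from this sketch.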