IEEE/ACM Transactions on Audio, Speech, and Language Processing

Training Deep Bidirectional LSTM Acoustic Model for LVCSR by a Context-Sensitive-Chunk BPTT Approach



Abstract

This paper presents a study of using a deep bidirectional long short-term memory (DBLSTM) recurrent neural network as the acoustic model for DBLSTM-HMM based large vocabulary continuous speech recognition (LVCSR). A context-sensitive-chunk (CSC) back-propagation through time (BPTT) approach is used to train the DBLSTM by splitting each training sequence into chunks with appended contextual observations, and a CSC-based decoding method with possibly overlapped CSCs is used for recognition. Our approach makes mini-batch based training on GPU more efficient and reduces the latency of DBLSTM-based LVCSR from a whole utterance to a short chunk. Evaluations have been made on the Switchboard-I benchmark task. In comparison with epoch-wise BPTT training, our method achieves a speedup of more than three times on a single GPU card without degrading recognition accuracy. In comparison with a highly optimized DNN-HMM system trained with a frame-level cross-entropy (CE) criterion, our CE-trained DBLSTM-HMM system achieves relative word error rate reductions of 9% and 5% on the Eval2000 and RT03S test sets, respectively. Furthermore, when running model-averaging based parallel training of the DBLSTM on a cluster of GPUs, CSC-BPTT incurs less accuracy degradation than epoch-wise BPTT while achieving a linear speedup.
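To make the chunking idea concrete, below is a minimal NumPy sketch of how an utterance's feature sequence could be split into context-sensitive chunks. The function name split_into_csc and the sizes used (64 central frames, 16 context frames per side) are illustrative assumptions, not the paper's actual settings.

```python
import numpy as np

def split_into_csc(frames, chunk_size=64, left_context=16, right_context=16):
    """Split a [T, D] feature sequence into context-sensitive chunks (CSCs).

    Each chunk carries up to `left_context` frames before it and
    `right_context` frames after it, so a BLSTM can be unrolled over a
    short chunk rather than the whole utterance while still seeing local
    bidirectional context. Parameter names and sizes are illustrative.
    """
    T = frames.shape[0]
    chunks = []
    for start in range(0, T, chunk_size):
        end = min(start + chunk_size, T)
        ctx_start = max(0, start - left_context)
        ctx_end = min(T, end + right_context)
        chunks.append({
            "data": frames[ctx_start:ctx_end],  # central frames plus appended context
            "loss_begin": start - ctx_start,    # first frame that contributes to the loss
            "loss_end": end - ctx_start,        # one past the last scoring frame
        })
    return chunks

# Example: a 200-frame utterance with 40-dim features yields 4 CSCs,
# each padded with up to 16 context frames on either side.
utterance = np.random.randn(200, 40).astype(np.float32)
cscs = split_into_csc(utterance)
print(len(cscs), cscs[0]["data"].shape, cscs[0]["loss_begin"], cscs[0]["loss_end"])
```

Under this reading, only the central loss_begin:loss_end frames would contribute to the BPTT error signal, with the appended context frames serving to warm up the forward and backward LSTM states; since the chunks are of similar length, they pack into GPU mini-batches far more evenly than whole utterances do. At decoding time, the abstract notes that CSCs may overlap, so every frame can be scored with adequate context on both sides.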

