Conference: International Symposium on Chinese Spoken Language Processing

Space-Time Residual LSTM Architecture for Distant Speech Recognition


Abstract

Long Short-Term Memory networks (Plain-LSTM) are effective for acoustic modeling in automatic speech recognition systems, but their training is hindered by vanishing and exploding gradients. To alleviate this problem, the paper introduces an improved space residual LSTM (S-RES-LSTM) which, unlike the previous RES-LSTM, takes the output before rather than after the LSTM projection layer as the spatial shortcut connection. Distant speech recognition experiments on AMI SDM show that, on the eval set with 9 layers, S-RES-LSTM achieves a 5% absolute WER (over) reduction and a 5.9% absolute WER (non-over) reduction relative to the Plain-LSTM, and a 0.6% absolute WER reduction relative to the RES-LSTM. To further enhance the information flow in S-RES-LSTM, the space and time residual LSTM (ST-RES-LSTM) is proposed, which adds a novel residual connection in the temporal dimension. Experiments show that, on the eval set with 9 layers, ST-RES-LSTM reduces absolute WER (over) by 5.5% and 1% relative to the Plain-LSTM and the RES-LSTM, respectively.
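
The spatial shortcut described in the abstract can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation: the abstract only says the shortcut uses the output before the projection layer, so the choice to add the lower layer's pre-projection output to the current layer's pre-projection output is an assumption, as are all names (`LSTMPLayer`, `run_stack`, `use_shortcut`) and the toy dimensions. Biases, peepholes, and the temporal residual connection of ST-RES-LSTM are left out.

```python
import math
import random


def matvec(W, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def rand_matrix(rows, cols, rng, scale=0.1):
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]


class LSTMPLayer:
    """LSTM layer with an output projection (biases and peepholes omitted)."""

    def __init__(self, n_in, n_cell, n_proj, rng):
        # One weight matrix per gate, applied to [input ; previous projected output].
        self.W = {g: rand_matrix(n_cell, n_in + n_proj, rng) for g in "ifog"}
        self.W_proj = rand_matrix(n_proj, n_cell, rng)

    def step(self, x, r_prev, c_prev, shortcut=None):
        z = x + r_prev  # list concatenation
        i = [sigmoid(a) for a in matvec(self.W["i"], z)]
        f = [sigmoid(a) for a in matvec(self.W["f"], z)]
        o = [sigmoid(a) for a in matvec(self.W["o"], z)]
        g = [math.tanh(a) for a in matvec(self.W["g"], z)]
        c = [fj * cj + ij * gj for fj, cj, ij, gj in zip(f, c_prev, i, g)]
        m = [oj * math.tanh(cj) for oj, cj in zip(o, c)]  # pre-projection output
        if shortcut is not None:
            # Assumed placement: add the lower layer's pre-projection output here.
            m = [a + b for a, b in zip(m, shortcut)]
        r = matvec(self.W_proj, m)  # post-projection output
        return m, r, c


def run_stack(layers, frames, n_cell, n_proj, use_shortcut=True):
    """Run stacked layers over a sequence; return the top layer's projected outputs."""
    r_state = [[0.0] * n_proj for _ in layers]
    c_state = [[0.0] * n_cell for _ in layers]
    outputs = []
    for x in frames:
        inp, m_below = x, None  # the first layer has no layer below it
        for k, layer in enumerate(layers):
            shortcut = m_below if use_shortcut else None
            m, r, c = layer.step(inp, r_state[k], c_state[k], shortcut)
            r_state[k], c_state[k] = r, c
            inp, m_below = r, m
        outputs.append(inp)
    return outputs


# Toy usage: three stacked layers over five random 4-dimensional frames.
rng = random.Random(0)
layers = [LSTMPLayer(4, 6, 3, rng)] + [LSTMPLayer(3, 6, 3, rng) for _ in range(2)]
frames = [[rng.uniform(-1, 1) for _ in range(4)] for _ in range(5)]
top_outputs = run_stack(layers, frames, n_cell=6, n_proj=3)
```

Because the pre-projection outputs of adjacent layers share the cell dimension in this sketch, the shortcut needs no extra weight matrix. Calling `run_stack` with `use_shortcut=False` gives a plain stacked baseline for comparison.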
