Journal: Journal of Signal Processing Systems for Signal, Image, and Video Technology

CTC Regularized Model Adaptation for Improving LSTM RNN Based Multi-Accent Mandarin Speech Recognition

Abstract

This paper proposes a novel regularized adaptation method to improve multi-accent Mandarin speech recognition. The acoustic model is a long short-term memory recurrent neural network trained with the connectionist temporal classification loss function (LSTM-RNN-CTC). Directly adjusting the network parameters on a small adaptation set can lead to over-fitting. To avoid this problem, a regularization term is added to the original training criterion; it forces the conditional probability distribution estimated by the adapted model to stay close to that of the accent-independent model. In addition, only the accent-specific output layer needs to be fine-tuned with this adaptation method. Experiments are conducted on the RASC863 and CASIA regional accented speech corpora. The results show that the proposed method clearly improves on the LSTM-RNN-CTC baseline model and also outperforms other adaptation methods.
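The abstract does not give the exact form of the regularization term. As a minimal sketch, assume it is a frame-level KL divergence between the output posteriors of the accent-independent model and those of the adapted model, interpolated with the CTC loss by a weight `rho`. The function names, the weight `rho`, and the KL form itself are illustrative assumptions, not the authors' definitions.

```python
import math

def ctc_neg_log_likelihood(probs, labels, blank=0):
    """-log p(labels | x) via the CTC forward algorithm.

    probs[t][k] is the posterior of symbol k at frame t. The recursion runs
    in the probability domain, which is only adequate for short toy inputs.
    """
    ext = [blank]
    for lab in labels:
        ext += [lab, blank]          # blank-augmented label sequence
    S, T = len(ext), len(probs)
    alpha = [0.0] * S
    alpha[0] = probs[0][blank]
    if S > 1:
        alpha[1] = probs[0][ext[1]]
    for t in range(1, T):
        new = [0.0] * S
        for s in range(S):
            a = alpha[s]
            if s >= 1:
                a += alpha[s - 1]
            # Skipping a blank is allowed only between two different labels.
            if s >= 2 and ext[s] != blank and ext[s] != ext[s - 2]:
                a += alpha[s - 2]
            new[s] = a * probs[t][ext[s]]
        alpha = new
    total = alpha[-1] + (alpha[-2] if S > 1 else 0.0)
    return -math.log(total)

def mean_frame_kl(p_ref, p_adapt):
    """Average over frames of KL(p_ref[t] || p_adapt[t])."""
    total = 0.0
    for ref, ada in zip(p_ref, p_adapt):
        total += sum(r * math.log(r / a) for r, a in zip(ref, ada) if r > 0)
    return total / len(p_ref)

def regularized_loss(p_adapt, p_ref, labels, rho=0.5):
    """Assumed criterion: (1 - rho) * CTC loss + rho * KL to the reference model."""
    ctc = ctc_neg_log_likelihood(p_adapt, labels)
    return (1.0 - rho) * ctc + rho * mean_frame_kl(p_ref, p_adapt)

# Toy example: two frames, symbols {0: blank, 1: a single label}.
p_ref = [[0.5, 0.5], [0.5, 0.5]]      # accent-independent model posteriors
p_adapt = [[0.3, 0.7], [0.4, 0.6]]    # adapted model posteriors
loss = regularized_loss(p_adapt, p_ref, labels=[1], rho=0.5)
```

Under this assumed form, `rho = 0` reduces to plain CTC fine-tuning, and larger values pull the adapted posteriors back toward the accent-independent model. In a real system the gradient of this loss would update only the accent-specific output layer, with the shared LSTM layers held fixed.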

Bibliographic details

  • Source
  • Author affiliations

    National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences; School of Computer and Control Engineering, University of Chinese Academy of Sciences

    National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences

    National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences; School of Computer and Control Engineering, University of Chinese Academy of Sciences; CAS Center for Excellence in Brain Science and Intelligence Technology, Institute of Automation, Chinese Academy of Sciences

    National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences; School of Computer and Control Engineering, University of Chinese Academy of Sciences

    National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences

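  • Indexed in: Science Citation Index (SCI, USA); Engineering Index (EI, USA)
  • Original format: PDF
  • Language of text: English
  • Chinese Library Classification
  • Keywords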

    multi-accent; Mandarin speech recognition; LSTM-RNN-CTC; model adaptation; CTC regularization;
