International Symposium on Chinese Spoken Language Processing

Syllable-Based Acoustic Modeling With Lattice-Free MMI for Mandarin Speech Recognition

Abstract

Most automatic speech recognition (ASR) systems of the past decades have used context-dependent (CD) phones as the fundamental acoustic units. However, these phone-based approaches lack an easy and efficient way of modeling long-term temporal dependencies. Compared with phone units, syllables span a longer time, typically several phones, and therefore have more stable acoustic realizations. In this work, we aim to train a syllable-based acoustic model (AM) for Mandarin ASR with the lattice-free maximum mutual information (LF-MMI) criterion. We expect that the combination of longer linguistic units, an RNN-based model structure, and a sequence-level objective function will result in better modeling of long-term temporal acoustic variations. We make multiple modifications to improve the performance of the syllable-based AM and benchmark our models on two large-scale databases. Experimental results show that the proposed syllable-based AM performs much better than the CD phone-based baseline, especially on noisy test sets, and decodes faster.
