IEEE Transactions on Neural Networks

Robust combination of neural networks and hidden Markov models for speech recognition

Abstract

Acoustic modeling in state-of-the-art speech recognition systems usually relies on hidden Markov models (HMMs) with Gaussian emission densities. HMMs suffer from intrinsic limitations, mainly due to their arbitrary parametric assumption. Artificial neural networks (ANNs) appear to be a promising alternative in this respect, but they historically failed as a general solution to the acoustic modeling problem. This paper introduces algorithms based on a gradient-ascent technique for global training of a hybrid ANN/HMM system, in which the ANN is trained to estimate the emission probabilities of the HMM states. The approach is related to the major hybrid systems proposed by Bourlard and Morgan and by Bengio, and aims to combine their benefits within a unified framework while overcoming their limitations. Several viable solutions are proposed to the "divergence problem", which may arise when training is carried out under the maximum-likelihood (ML) criterion. Experimental results in speaker-independent, continuous speech recognition over Italian digit strings validate the novel hybrid framework, which improves recognition performance over HMMs with mixtures of Gaussian components as well as over Bourlard and Morgan's paradigm. In particular, the maximum a posteriori (MAP) version of the algorithm is shown to yield a 46.34% relative word error rate reduction with respect to standard HMMs.
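To make the hybrid idea concrete, the following is a minimal sketch of how network outputs can stand in for Gaussian emission densities inside the standard HMM forward recursion. It is not the paper's algorithm: the one-layer sigmoid network, the function names `ann_emissions` and `forward_log_likelihood`, and all parameter shapes are assumptions made for illustration, and the sketch covers only likelihood evaluation, not the gradient-ascent training or the remedies for the divergence problem.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def ann_emissions(frame, weights, biases):
    """Hypothetical one-layer network with one sigmoid output per HMM state.

    Returns one positive score per state, used in place of a Gaussian
    emission density. The scores are not normalized densities.
    """
    return [sigmoid(sum(w * x for w, x in zip(row, frame)) + b)
            for row, b in zip(weights, biases)]

def forward_log_likelihood(frames, initial, transition, weights, biases):
    """Scaled forward recursion over a sequence of acoustic frames.

    initial[i] is the probability of starting in state i and
    transition[i][j] the probability of moving from state i to state j.
    Returns the log of the sequence score under the hybrid model.
    """
    n_states = len(initial)
    log_lik = 0.0
    alpha = None
    for t, frame in enumerate(frames):
        emis = ann_emissions(frame, weights, biases)
        if t == 0:
            alpha = [initial[j] * emis[j] for j in range(n_states)]
        else:
            alpha = [emis[j] * sum(alpha[i] * transition[i][j]
                                   for i in range(n_states))
                     for j in range(n_states)]
        # Rescale at every frame to avoid underflow on long utterances.
        scale = sum(alpha)
        log_lik += math.log(scale)
        alpha = [a / scale for a in alpha]
    return log_lik

# Illustrative values only: two states, two-dimensional frames.
frames = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]
initial = [0.6, 0.4]
transition = [[0.7, 0.3], [0.2, 0.8]]
weights = [[0.0, 0.0], [0.0, 0.0]]
biases = [0.0, 0.0]
log_lik = forward_log_likelihood(frames, initial, transition, weights, biases)
```

With all-zero weights every state scores 0.5 on every frame, so the result equals three times log(0.5), which is a convenient sanity check. Because the network outputs are unconstrained scores, nothing in this sketch prevents them from growing during likelihood maximization; any normalization or constraint would have to be added separately.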