Journal: Speech Communication

Model-based feature enhancement with uncertainty decoding for noise robust ASR

Abstract

In this paper, several techniques are proposed to incorporate the uncertainty of the clean speech estimate in the decoding process of the backend recogniser in the context of model-based feature enhancement (MBFE) for noise robust speech recognition. Usually, the Gaussians in the acoustic space are sampled at a single point estimate, which means that the backend recogniser treats its input as a noise-free utterance. In this way, however, the variance of the estimator is neglected. To solve this problem, it has already been argued that the acoustic space should be evaluated over a probability density function, e.g. a Gaussian observation pdf. We illustrate that this Gaussian observation pdf can be replaced by a computationally more tractable discrete pdf, consisting of a weighted sum of delta functions. We also show how improved posterior state probabilities can be obtained by calculating their maximum likelihood estimates or by using the pdf of clean speech conditioned on both the noisy speech and the backend Gaussian. Another simple and efficient technique is to replace these posterior probabilities by M Kronecker deltas, which results in M front-end feature vector candidates, and to take the maximum over their backend scores. Experimental results on the Aurora2 and Aurora4 databases are given to compare the proposed techniques. A significant decrease in the word error rate of the resulting speech recognition system is obtained.
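
The scoring variants mentioned in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: it assumes diagonal-covariance backend Gaussians, and every function and parameter name below is hypothetical. It contrasts (1) scoring a single point estimate, (2) integrating over a Gaussian observation pdf, which for Gaussians amounts to adding the estimator variance to the backend variance, (3) replacing that pdf by a weighted sum of delta functions, and (4) taking the maximum backend score over M candidates.

```python
import numpy as np


def log_gauss_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian, summed over the last axis."""
    x = np.asarray(x, dtype=float)
    mean = np.asarray(mean, dtype=float)
    var = np.asarray(var, dtype=float)
    return -0.5 * np.sum(np.log(2.0 * np.pi * var) + (x - mean) ** 2 / var, axis=-1)


def score_point(x_hat, mean, var):
    """Point estimate: the backend treats x_hat as if it were clean speech."""
    return log_gauss_diag(x_hat, mean, var)


def score_gaussian_uncertainty(x_hat, est_var, mean, var):
    """Gaussian observation pdf N(x; x_hat, est_var) integrated against the
    backend Gaussian; the result is N(x_hat; mean, var + est_var)."""
    return log_gauss_diag(x_hat, mean, np.asarray(var, dtype=float) + est_var)


def score_discrete(candidates, weights, mean, var):
    """Discrete pdf sum_m w_m * delta(x - x_m): the integral collapses to
    log sum_m w_m N(x_m; mean, var), computed with a stable log-sum-exp.
    Assumes all weights are strictly positive."""
    log_terms = np.log(np.asarray(weights, dtype=float)) + log_gauss_diag(candidates, mean, var)
    peak = np.max(log_terms)
    return peak + np.log(np.sum(np.exp(log_terms - peak)))


def score_max_candidate(candidates, mean, var):
    """M feature vector candidates: keep the best backend score."""
    return np.max(log_gauss_diag(candidates, mean, var))
```

Here `candidates` is an (M, D) array of front-end feature vector candidates and `weights` holds their M weights. With a single candidate of weight 1, or with zero estimator variance, the uncertainty-aware scores reduce to the point-estimate score.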