Journal: Computer Speech and Language

Uncertainty weighting and propagation in DNN-HMM-based speech recognition

Abstract

This paper proposes an uncertainty weighting scheme for DNN-HMM-based speech recognition that increases discriminability in the decoding process. To this end, the DNN pseudo-log-likelihoods are weighted according to the uncertainty variance assigned to the acoustic observation. The results suggest that a substantial reduction in WER is achieved with clean training. Moreover, the method does not require modelling the propagation of uncertainty through the DNN, and it makes no approximations for non-linear activation functions. It can be applied to any network topology that delivers log-likelihood-like scores, can be combined with any noise removal technique, and adds minimal computational cost. The technique was evaluated exhaustively and was also combined with uncertainty-propagation-based schemes for computing the pseudo-log-likelihoods and uncertainty variance at the DNN output. Two proposed methods optimize the parameters of the weighting function by grid search, either on a development database representing the given task or on each utterance using discrimination metrics. Experiments on the Aurora-4 task showed that, with clean training, the proposed weighting scheme can reduce WER by up to 21% compared with a baseline system that uses spectral subtraction and uncertainty propagation with the unscented transform. The uncertainty weighting method also reduced the gap between clean and multi-noise/multi-condition training, which is useful when it is hard to train a DNN-HMM system under conditions similar to the test conditions. Finally, the results on the use of uncertainty are very competitive with those published elsewhere on the same database.
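
The abstract does not give the form of the weighting function, so the following Python sketch only illustrates the general idea under stated assumptions. The mapping `w = 1 / (1 + alpha * variance)`, the parameter name `alpha`, and all function names are hypothetical and are not taken from the paper. The sketch scales each frame's pseudo-log-likelihoods by a weight that shrinks as the frame's uncertainty variance grows, and it picks `alpha` by grid search against a caller-supplied score, which stands in for a development-set or per-utterance discrimination metric.

```python
import numpy as np


def uncertainty_weights(frame_variance, alpha=1.0):
    """Map per-frame uncertainty variance to a weight in (0, 1].

    Hypothetical form (an assumption, not the paper's function):
    w = 1 / (1 + alpha * variance).
    """
    frame_variance = np.asarray(frame_variance, dtype=float)
    return 1.0 / (1.0 + alpha * frame_variance)


def weight_log_likelihoods(log_likes, frame_variance, alpha=1.0):
    """Scale each frame's pseudo-log-likelihoods by its uncertainty weight.

    log_likes: array of shape (frames, states).
    frame_variance: array of shape (frames,), one scalar per frame.
    """
    weights = uncertainty_weights(frame_variance, alpha)
    # Broadcast the per-frame weight across all state scores of that frame.
    return np.asarray(log_likes, dtype=float) * weights[:, None]


def grid_search_alpha(log_likes, frame_variance, score_fn, alphas):
    """Return the (alpha, score) pair from the grid with the highest score.

    score_fn is supplied by the caller and takes the weighted scores;
    higher is assumed to be better.
    """
    best_alpha, best_score = None, -np.inf
    for alpha in alphas:
        weighted = weight_log_likelihoods(log_likes, frame_variance, alpha)
        score = score_fn(weighted)
        if score > best_score:
            best_alpha, best_score = alpha, score
    return best_alpha, best_score
```

With weights in (0, 1], frames with high uncertainty contribute flatter, smaller-magnitude scores, so they influence the decoded path less than reliable frames. In practice `score_fn` would wrap a decoder run or a discrimination metric; its definition is left open here because the abstract does not specify one.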
