Uncertainty weighting and propagation in DNN-HMM-based speech recognition

Jose Novoa; Josue Fredes; Victor Poblete; Nestor Becerra Yoma

Abstract

In this paper, an uncertainty weighting scheme for DNN-HMM-based speech recognition is proposed to increase discriminability in the decoding process. To this end, the DNN pseudo-log-likelihoods are weighted according to the uncertainty variance assigned to the acoustic observation. The results presented here suggest that a substantial reduction in WER is achieved with clean training. Moreover, modelling the uncertainty propagation through the DNN is not required, and no approximations for non-linear activation functions are made. The presented method can be applied to any network topology that delivers log-likelihood-like scores. It can be combined with any noise removal technique and adds minimal computational cost. The technique was exhaustively evaluated and combined with uncertainty-propagation-based schemes for computing the pseudo-log-likelihoods and uncertainty variance at the DNN output. Two methods are proposed to optimize the parameters of the weighting function by grid search, either on a development database representing the given task or on each utterance based on discrimination metrics. Experiments on the Aurora-4 task showed that, with clean training, the proposed weighting scheme can reduce WER by a maximum of 21% compared with a baseline system with spectral subtraction and uncertainty propagation using the unscented transform. The uncertainty weighting method reduced the gap between clean and multi-noise/multi-condition training. This can be useful when it is not easy to train a DNN-HMM system in conditions that are similar to the testing ones. Finally, the presented results on the use of uncertainty are very competitive with those published elsewhere using the same database.
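
The weighting idea described in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the form of the weighting function, so everything specific here is an assumption made for illustration: the function w = 1 / (1 + alpha * var) ** beta, the parameter names alpha and beta, the use of one scalar uncertainty variance per frame, and the helper names frame_weights, weight_log_likelihoods and grid_search. It should not be read as the authors' implementation.

```python
import numpy as np


def frame_weights(uncertainty_var, alpha, beta):
    """Map per-frame uncertainty variance to a weight in (0, 1].

    Hypothetical form (not taken from the paper):
        w = 1 / (1 + alpha * var) ** beta
    Zero variance gives weight 1; larger variance gives a smaller weight.
    """
    var = np.asarray(uncertainty_var, dtype=float)
    return 1.0 / (1.0 + alpha * var) ** beta


def weight_log_likelihoods(pseudo_loglik, uncertainty_var, alpha=1.0, beta=1.0):
    """Scale each frame's pseudo-log-likelihood vector by that frame's weight.

    pseudo_loglik:   array of shape (frames, states)
    uncertainty_var: array of shape (frames,), assumed one scalar per frame
    """
    ll = np.asarray(pseudo_loglik, dtype=float)
    w = frame_weights(uncertainty_var, alpha, beta)
    return ll * w[:, None]  # broadcast the frame weight over all states


def grid_search(score_fn, alphas, betas):
    """Return (alpha, beta, score) minimising score_fn over the grid.

    score_fn is a hypothetical callable, e.g. one that decodes a
    development set with the given parameters and returns its WER.
    """
    best = min((score_fn(a, b), a, b) for a in alphas for b in betas)
    return best[1], best[2], best[0]


if __name__ == "__main__":
    ll = np.array([[-1.0, -2.0], [-3.0, -4.0]])
    var = np.array([0.0, 1.0])
    print(weight_log_likelihoods(ll, var, alpha=1.0, beta=1.0))
```

In this sketch a frame with zero uncertainty keeps its scores unchanged, while the scores of a high-uncertainty frame shrink toward zero, so the differences between states in that frame count for less during decoding. The grid search mirrors the abstract's first tuning option (a development set); the per-utterance option would call the same routine with a score based on a discrimination metric instead of WER.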

Bibliographic record

Source
Computer Speech and Language | 2018, Issue 1 | pp. 30-46 | 17 pages
Authors
Jose Novoa; Josue Fredes; Victor Poblete; Nestor Becerra Yoma
Author affiliations

Speech Processing and Transmission Laboratory, Electrical Engineering Department, University of Chile, Santiago, Chile;

Speech Processing and Transmission Laboratory, Electrical Engineering Department, University of Chile, Santiago, Chile;

Institute of Acoustics, Universidad Austral de Chile, Valdivia, Chile;

Speech Processing and Transmission Laboratory, Electrical Engineering Department, University of Chile, Santiago, Chile;

Indexed in: Science Citation Index (SCI), USA; Engineering Index (EI), USA
Original format: PDF
Language of text: English
Chinese Library Classification:
Keywords
Automatic speech recognition; Deep neural network; Uncertainty weighting; Uncertainty propagation; DNN-HMM;

Similar literature

1. Assessment of dysarthric speech using Elman back propagation network (recurrent network) for speech recognition [J]. S. Selva Nidhyananthan, R. Shantha Selva Kumari, V. Shenbagalakshmi. International Journal of Speech Technology, 2016, Issue 3.
2. Effects of Second Language Proficiency and Linguistic Uncertainty on Recognition of Speech in Native and Nonnative Competing Speech [J]. Francis Alexander L., Tigchelaar Laura J., Zhang Rongrong. Journal of Speech, Language, and Hearing Research (JSLHR), 2018, Issue 7.
3. Dual-channel spectral weighting for robust speech recognition in mobile devices [J]. Lopez-Espejo Ivan, Peinado Antonio M., Gomez Angel M. Digital Signal Processing, 2018.
4. Tight Coupling of Speech Recognition and Dialog Management - Dialog-Context Dependent Grammar Weighting for Speech Recognition [C]. Christian Fuegen, Hartwig Holzapfel, Alex Waibel. International Conference on Spoken Language Processing, 4-8 October 2004, Jeju (KR).
5. Statistical machine translation and automatic speech recognition under uncertainty [D]. Mathias, Lambert. 2008.
6. Recognition of time-compressed speech does not predict recognition of natural fast-rate speech by older listeners [O]. Sandra Gordon-Salant, Danielle J. Zion, Carol Espy-Wilson.
7. Maximum likelihood weighting of dynamic speech features for CDHMM speech recognition [O]. Hernando Pericás, Francisco Javier. 1997.
