Model-based feature enhancement with uncertainty decoding for noise robust ASR

Veronique Stouten; Hugo Van hamme; Patrick Wambacq

Abstract

In this paper, several techniques are proposed to incorporate the uncertainty of the clean speech estimate in the decoding process of the backend recogniser in the context of model-based feature enhancement (MBFE) for noise robust speech recognition. Usually, the Gaussians in the acoustic space are sampled in a single point estimate, which means that the backend recogniser considers its input as a noise-free utterance. However, in this way the variance of the estimator is neglected. To solve this problem, it has already been argued that the acoustic space should be evaluated in a probability density function, e.g. a Gaussian observation pdf. We illustrate that this Gaussian observation pdf can be replaced by a computationally more tractable discrete pdf, consisting of a weighted sum of delta functions. We also show how improved posterior state probabilities can be obtained by calculating their maximum likelihood estimates or by using the pdf of clean speech conditioned on both the noisy speech and the backend Gaussian. Another simple and efficient technique is to replace these posterior probabilities by M Kronecker deltas, which results in M front-end feature vector candidates, and to take the maximum over their backend scores. Experimental results are given for the Aurora2 and Aurora4 databases to compare the proposed techniques. A significant decrease of the word error rate of the resulting speech recognition system is obtained.
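The scoring variants named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a single diagonal-covariance backend Gaussian, uses the standard closed form for integrating one Gaussian against another (the variances add) to stand in for the Gaussian observation pdf, and all function and variable names are hypothetical.

```python
import math


def log_gauss(x, mean, var):
    """Log density of a diagonal-covariance Gaussian evaluated at vector x."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )


def score_point(x_hat, mean, var):
    """Conventional decoding: the clean-speech estimate is treated as exact."""
    return log_gauss(x_hat, mean, var)


def score_gaussian_uncertainty(x_hat, var_hat, mean, var):
    """Gaussian observation pdf N(x; x_hat, var_hat) integrated against the
    backend Gaussian. Standard result: the two variances add."""
    widened = [v + vh for v, vh in zip(var, var_hat)]
    return log_gauss(x_hat, mean, widened)


def score_discrete(candidates, weights, mean, var):
    """Discrete observation pdf: a weighted sum of delta functions, so the
    integral collapses to sum_m w_m * N(x_m; mean, var).
    Assumes strictly positive weights; computed with log-sum-exp."""
    terms = [
        math.log(w) + log_gauss(x, mean, var)
        for x, w in zip(candidates, weights)
    ]
    peak = max(terms)
    return peak + math.log(sum(math.exp(t - peak) for t in terms))


def score_max_candidate(candidates, mean, var):
    """M candidate feature vectors: keep the best backend score."""
    return max(log_gauss(x, mean, var) for x in candidates)


if __name__ == "__main__":
    # Illustrative numbers only; they are not taken from the paper.
    mean, var = [0.0, 1.0], [1.0, 0.5]
    candidates = [[0.2, 0.9], [1.5, 2.0]]
    weights = [0.7, 0.3]
    print(score_point(candidates[0], mean, var))
    print(score_gaussian_uncertainty(candidates[0], [0.3, 0.3], mean, var))
    print(score_discrete(candidates, weights, mean, var))
    print(score_max_candidate(candidates, mean, var))
```

With weights that sum to one, the max-over-candidates score is never lower than the discrete-pdf score, and setting the estimator variance to zero reduces the Gaussian-uncertainty score to the point-estimate score.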


Bibliographic details

Source
Speech Communication | 2006, No. 11 | pp. 1502-1514 | 13 pages
Authors
Veronique Stouten; Hugo Van hamme; Patrick Wambacq
Author affiliations

Katholieke Universiteit Leuven, Department ESAT, Kasteelpark Arenberg 10, B-3001 Leuven, Belgium;

Original format: PDF
Language: English
Chinese Library Classification: Language and writing
Keywords
Noise robust speech recognition; Model-based feature enhancement; Additive noise; Convolutional noise; Uncertainty decoding;


