Maximum Likelihood Normalization for Robust Speech Recognition


Abstract

It is well known that additive and channel noise cause shift and scaling in MFCC features. Empirical normalization techniques that estimate and compensate for these effects, such as cepstral mean subtraction and variance normalization, have been shown to be useful. However, these empirical estimates may not be optimal. In this paper, we approach the problem from two directions: 1) we use more robust MFCC-based features that are less sensitive to additive and channel noise, and 2) we propose a maximum likelihood (ML) based approach to compensate for the noise effect. In addition, we propose the use of multi-class normalization, in which different normalization factors can be applied to different phonetic units. The combination of the robust features and ML normalization is particularly useful for the highly mismatched condition in the Aurora 3 corpus, resulting in a 15.8% relative improvement in the highly mismatched case and a 10.4% relative improvement on average over the three conditions.
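As a point of reference for the empirical baseline the abstract mentions, the following is a minimal sketch of per-utterance cepstral mean subtraction and variance normalization. It is not the paper's ML-based method, which the abstract does not specify in detail. The function name `cmvn`, the `eps` guard, and the synthetic data standing in for MFCC frames are all assumptions made for illustration.

```python
import numpy as np


def cmvn(features, eps=1e-8):
    """Per-utterance cepstral mean and variance normalization (illustrative).

    features: array of shape (num_frames, num_coeffs), e.g. MFCC frames.
    eps: small constant (an assumption here) that avoids division by zero.
    """
    features = np.asarray(features, dtype=float)
    mean = features.mean(axis=0)  # empirical estimate of the shift
    std = features.std(axis=0)    # empirical estimate of the scaling
    return (features - mean) / (std + eps)


# Synthetic stand-in for MFCC frames (not real speech data).
rng = np.random.default_rng(0)
clean = rng.normal(size=(200, 13))

# Model the distortion as a pure scaling and shift of each coefficient.
distorted = 0.5 * clean + 3.0

normalized_clean = cmvn(clean)
normalized_distorted = cmvn(distorted)
```

Under this idealized shift-and-scale model, `normalized_clean` and `normalized_distorted` coincide up to numerical precision, which is why such normalization helps. Real noise is not a pure affine distortion, which is consistent with the abstract's remark that the empirical estimates may not be optimal.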


Bibliographic details

Source
European Conference on Speech Communication and Technology - EUROSPEECH 2003 (INTERSPEECH 2003), vol. 1; 1-4 September 2003; Geneva (CH) | 2003 | pp. 13-16 | 4 pages
Conference location: Geneva (CH)
Authors
Yiu-Pong LAI; Man-Hung SIU;
Author affiliation

Department of Electrical and Electronic Engineering, Hong Kong University of Science and Technology, Hong Kong

Conference organizer
Original format: PDF
Language of text: English
Chinese Library Classification: automatic information theory
Keywords
Date added to database: 2022-08-26 13:48:55

Similar literature

1. A Beamforming Algorithm Based on Maximum Likelihood of a Complex Gaussian Distribution With Time-Varying Variances for Robust Speech Recognition [J]. Byung Joon Cho, Jun-Min Lee, Hyung-Min Park. IEEE Signal Processing Letters. 2019, No. 9
2. Maximum likelihood subband polynomial regression for robust speech recognition [J]. Yong Lue, Zhenyang Wu. Applied Acoustics. 2013, No. 5
3. Noisy Constrained Maximum-Likelihood Linear Regression for Noise-Robust Speech Recognition [J]. Kim D. K., Gales M. J. F. IEEE Transactions on Audio, Speech, and Language Processing. 2011, No. 2
4. Maximum Likelihood Normalization for Robust Speech Recognition [C]. Yiu-Pong LAI, Man-Hung SIU. International Speech Communication Association (ISCA), European Conference on Speech Communication and Technology. 2003
5. Acoustic modeling and speaker normalization strategies with application to robust in-vehicle speech recognition and dialect classification [D]. Yapanel, Umit. 2005
6. Robust unified Granger causality analysis: a normalized maximum likelihood form [O]. Zhenghui Hu, Fei Li, Minjia Cheng. 2021
7. Feature generation based on maximum normalized acoustic likelihood for improved speech recognition [O]. Xiang Li, Richard M. Stern. 2008
8. Improving on hidden Markov models: An articulatorily constrained, maximum likelihood approach to speech recognition and speech coding [R]. Hogden, J. 1996
