Estimation of cepstral coefficients for robust speech recognition.


Abstract

This dissertation introduces a new approach to estimation of the features used in an automatic speech recognition system operating in noisy environments, namely mel-frequency cepstral coefficients. A major challenge in the development of an estimator for these features is the nonlinear interaction between a speech signal and the corrupting ambient noise. Previous estimation methods have attempted to deal with this issue with the use of a low order Taylor series expansion, which results in a rough approximation of the true distortion interaction between the speech and noise signal components, and the estimators must typically be iterative, as it is the speech features themselves that are used as expansion points. The new estimation approach, named the additive cepstral distortion model minimum mean-square error estimator, uses a novel distortion model to avoid the necessity of a Taylor series expansion, allowing for a direct solution.

Like many previous approaches, the estimator introduced in this dissertation uses a prior distribution model of the speech features. In previous work, this distribution is limited in specificity, as a single global model is trained over an entire set of speech data. The estimation approach introduced in this work extends this method to incorporate contextual information into the prior model, leading to a more specific distribution and subsequently better estimates of the features.

An introduction to automatic speech recognition is presented, and a historical review of relevant feature estimation research is given. The new estimation approach is developed, along with a method for implementing more specific prior distribution modeling, and the new feature estimation approach is evaluated on two standard robust speech recognition datasets.
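
The nonlinear speech-noise interaction and the Taylor series approximation mentioned in the abstract can be illustrated with a minimal sketch. It is not the dissertation's additive cepstral distortion model, whose form the abstract does not give. It assumes the commonly used simplification that speech and noise energies add in the linear mel-filterbank domain (channel and phase cross-term ignored), so that in the log domain y = log(exp(x) + exp(n)). Because cepstral coefficients are a linear transform of log mel energies, the same nonlinearity carries over to them. The function names and the numeric values are hypothetical and chosen only for illustration.

```python
import math

def noisy_log_mel(x, n):
    """Log mel energy of speech plus noise, assuming the two add in the
    linear energy domain: y = log(exp(x) + exp(n)), computed stably."""
    m = max(x, n)
    return m + math.log(math.exp(x - m) + math.exp(n - m))

def first_order_taylor(x, n, x0, n0):
    """First-order Taylor expansion of noisy_log_mel around (x0, n0)."""
    y0 = noisy_log_mel(x0, n0)
    g = 1.0 / (1.0 + math.exp(n0 - x0))  # partial derivative dy/dx at (x0, n0)
    return y0 + g * (x - x0) + (1.0 - g) * (n - n0)  # dy/dn equals 1 - g

# Hypothetical setup: noise level fixed at 0, expansion point at equal
# speech and noise energy (x0 = n0 = 0).
n, x0, n0 = 0.0, 0.0, 0.0
for x in (-4.0, -1.0, 0.0, 1.0, 4.0):
    exact = noisy_log_mel(x, n)
    approx = first_order_taylor(x, n, x0, n0)
    print(f"x={x:+.1f}  exact={exact:.3f}  taylor={approx:.3f}  "
          f"error={exact - approx:+.3f}")
```

Under these assumptions the linearization is exact only at the expansion point and degrades as the clean speech value moves away from it, which is consistent with the abstract's remark that Taylor-based estimators are rough approximations and typically need to iterate on the expansion point.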

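Bibliographic record

Author: Indrebo, Kevin M.
Author affiliation: Marquette University
Degree-granting institution: Marquette University
Discipline: Engineering, Electronics and Electrical
Degree: Ph.D.
Year: 2008
Pages: 135 p.
Total pages: 135
Original format: PDF
Language of text: eng
Chinese Library Classification: Radio electronics, telecommunications technology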
