Estimation of cepstral coefficients for robust speech recognition.

Abstract

This dissertation introduces a new approach to estimation of the features used in an automatic speech recognition system operating in noisy environments, namely mel-frequency cepstral coefficients. A major challenge in the development of an estimator for these features is the nonlinear interaction between a speech signal and the corrupting ambient noise. Previous estimation methods have attempted to deal with this issue with the use of a low order Taylor series expansion, which results in a rough approximation of the true distortion interaction between the speech and noise signal components, and the estimators must typically be iterative, as it is the speech features themselves that are used as expansion points. The new estimation approach, named the additive cepstral distortion model minimum mean-square error estimator, uses a novel distortion model to avoid the necessity of a Taylor series expansion, allowing for a direct solution.

Like many previous approaches, the estimator introduced in this dissertation uses a prior distribution model of the speech features. In previous work, this distribution is limited in specificity, as a single global model is trained over an entire set of speech data. The estimation approach introduced in this work extends this method to incorporate contextual information into the prior model, leading to a more specific distribution and subsequently better estimates of the features.

An introduction to automatic speech recognition is presented, and a historical review of relevant feature estimation research is given. The new estimation approach is developed, along with a method for implementing more specific prior distribution modeling, and the new feature estimation approach is evaluated on two standard robust speech recognition datasets.
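
The nonlinear speech/noise interaction and the Taylor-series workaround mentioned in the abstract can be illustrated with a small numerical sketch. It assumes the commonly used log-spectral model of additive noise with the phase term ignored, y = log(exp(x) + exp(n)), and a first-order expansion around a point (x0, n0). This is not the dissertation's additive cepstral distortion model, which the abstract does not specify; the function names and numeric values are hypothetical, and a real MFCC pipeline would additionally apply a DCT to the log-mel energies.

```python
import numpy as np

def noisy_log_energy(x, n):
    """Exact log-domain combination of clean speech x and noise n
    (assumed model: energies add, phase ignored)."""
    return np.logaddexp(x, n)  # log(exp(x) + exp(n)), numerically stable

def taylor_first_order(x, n, x0, n0):
    """First-order Taylor approximation of noisy_log_energy around (x0, n0)."""
    g = 1.0 / (1.0 + np.exp(n0 - x0))  # partial derivative w.r.t. x at (x0, n0)
    # The partial derivative w.r.t. n is (1 - g).
    return np.logaddexp(x0, n0) + g * (x - x0) + (1.0 - g) * (n - n0)

# Hypothetical expansion point and noise level (log-energy units).
x0, n0 = 2.0, 1.0
for dx in (0.0, 0.5, 2.0, 4.0):
    x = x0 + dx
    exact = noisy_log_energy(x, n0)
    approx = taylor_first_order(x, n0, x0, n0)
    print(f"x - x0 = {dx:3.1f}  exact = {exact:.4f}  "
          f"approx = {approx:.4f}  error = {exact - approx:.4f}")
```

The approximation matches the exact value at the expansion point and degrades as the clean-speech value moves away from it. Under this assumed model, that is why an estimator that expands around the unknown speech features would need to re-expand around each updated estimate.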

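Bibliographic record

  • Author

    Indrebo, Kevin M.

  • Author affiliation

    Marquette University

  • Degree-granting institution: Marquette University
  • Subject: Engineering, Electronics and Electrical
  • Degree: Ph.D.
  • Year: 2008
  • Pagination: 135 p.
  • Total pages: 135
  • Original format: PDF
  • Language of text: English (eng)
  • Chinese Library Classification: Radio electronics and telecommunications technology
  • Keywords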
