IEEE Transactions on Speech and Audio Processing

A general joint additive and convolutive bias compensation approach applied to noisy Lombard speech recognition

Abstract

A unified approach to the acoustic mismatch problem is proposed. A maximum likelihood state-based additive bias compensation algorithm is developed for the continuous density hidden Markov model (CDHMM). Based on this technique, specific bias models in the mel cepstral and the linear spectral domains are presented. Among these models, a new polynomial trend bias model in the mel cepstral domain is derived, which proved effective for Lombard speech compensation. In addition, a joint estimation algorithm for additive and convolutive bias compensation is proposed. This algorithm is based on applying the expectation maximization (EM) technique in both of the above-mentioned domains, in conjunction with a parallel model combination (PMC) based transformation. The compensation of the dynamic (difference) coefficients in the proposed framework is also studied. The evaluation database consists of a vocabulary of 21 confusable words uttered by 24 speakers. Three mismatched versions of the database are considered: Lombard speech, 15 dB noisy Lombard speech, and 5 dB noisy Lombard speech. The proposed techniques result in 50.9%, 74.6%, and 67.3% reductions in the performance difference between matched and uncompensated word error rates for the three mismatch conditions, respectively. When dynamic coefficients are considered, the corresponding reductions are 46.8%, 72.4%, and 70.9%.
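
The reported percentages describe how much of the gap between matched and uncompensated word error rates (WER) is closed by compensation. The abstract does not give the formula, so the sketch below assumes the usual reading: the drop in WER due to compensation divided by the matched-to-uncompensated gap. The function name `gap_reduction` and the WER values in the example are hypothetical and are not taken from the paper.

```python
def gap_reduction(wer_matched, wer_uncompensated, wer_compensated):
    """Percent of the matched-vs-uncompensated WER gap closed by compensation.

    Assumed definition (not stated in the abstract):
        100 * (WER_uncompensated - WER_compensated)
            / (WER_uncompensated - WER_matched)
    """
    gap = wer_uncompensated - wer_matched
    if gap <= 0:
        raise ValueError("uncompensated WER must exceed matched WER")
    return 100.0 * (wer_uncompensated - wer_compensated) / gap


# Hypothetical WERs in percent, chosen only to illustrate the arithmetic.
example = gap_reduction(wer_matched=5.0,
                        wer_uncompensated=25.0,
                        wer_compensated=15.0)
print(f"{example:.1f}% of the gap closed")  # prints "50.0% of the gap closed"
```

Under this reading, a value of 100% would mean the compensated system performs as well as the matched system, and 0% would mean compensation gave no improvement.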
