
Minimum Variance Distortionless Response on a Warped Frequency Scale


Abstract

In this work, we propose a time-domain technique for estimating an all-pole model based on the minimum variance distortionless response (MVDR) on a warped short-time frequency axis such as the Mel scale. Using the MVDR eliminates the overemphasis of harmonic peaks typically seen in medium- and high-pitched voiced speech when spectral estimation is based on linear prediction (LP). Moreover, warping the frequency axis prior to MVDR spectral estimation ensures that more parameters of the spectral model are allocated to the low-frequency regions of the spectrum than to the high-frequency regions, thereby mimicking the human auditory system. In a series of speech recognition experiments on the Switchboard Corpus (spontaneous English telephone speech), the proposed approach achieved a word error rate (WER) of 32.1% for female speakers, which is clearly superior to the 33.2% WER obtained by the usual combination of Mel warping and linear prediction.
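
As a rough illustration of the two ingredients named in the abstract, the sketch below computes an MVDR power spectrum directly from its textbook definition, P(ω) = 1 / (vᴴ R⁻¹ v), and provides the frequency map of a first-order all-pass (bilinear) transform, which is one common way to approximate a Mel-like warping. It is not the time-domain warped estimator proposed in the paper: the function names, the default grid size, and the warping factor `alpha` are assumptions made for illustration only, and the two pieces are shown separately rather than combined.

```python
import numpy as np


def autocorrelation(frame, order):
    """Biased autocorrelation estimates r[0..order] of a real-valued frame."""
    frame = np.asarray(frame, dtype=float)
    n = len(frame)
    return np.array(
        [np.dot(frame[: n - k], frame[k:]) / n for k in range(order + 1)]
    )


def mvdr_spectrum(r, n_freqs=257):
    """MVDR power spectrum from autocorrelation r[0..M] on the grid [0, pi].

    Uses the definition P(w) = 1 / (v^H R^{-1} v), where R is the
    (M+1) x (M+1) Toeplitz autocorrelation matrix and
    v = [1, e^{jw}, ..., e^{jwM}] is the steering vector.
    """
    r = np.asarray(r, dtype=float)
    order = len(r) - 1
    idx = np.arange(order + 1)
    # Symmetric Toeplitz matrix: R[i, j] = r[|i - j|]
    R = r[np.abs(idx[:, None] - idx[None, :])]
    R_inv = np.linalg.inv(R)
    omega = np.linspace(0.0, np.pi, n_freqs)
    V = np.exp(1j * np.outer(omega, idx))  # one steering vector per row
    # Quadratic form v^H R^{-1} v for every frequency on the grid
    denom = np.einsum("fi,ij,fj->f", V.conj(), R_inv, V).real
    return omega, 1.0 / denom


def bilinear_warp(omega, alpha):
    """Frequency map of a first-order all-pass with warping factor alpha.

    For 0 < alpha < 1 the low frequencies are stretched, so a fixed-order
    model spends more of its resolution there. The value of alpha is an
    assumption; it is not taken from the paper.
    """
    return omega + 2.0 * np.arctan2(
        alpha * np.sin(omega), 1.0 - alpha * np.cos(omega)
    )
```

A typical use would be `omega, p = mvdr_spectrum(autocorrelation(frame, order))` for a windowed speech frame. Inverting R explicitly keeps the sketch close to the definition; a practical implementation would more likely obtain the same envelope from LP coefficients computed with a Levinson-Durbin recursion.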
