Journal: IEEE Transactions on Audio, Speech, and Language Processing

A Multiresolution Model of Auditory Excitation Pattern and Its Application to Objective Evaluation of Perceived Speech Quality

Abstract

This paper proposes a multiresolution model of the auditory excitation pattern and applies it to the objective evaluation of subjective wideband speech quality. The model uses the wavelet packet transform for time-frequency decomposition of the input signal. The wavelet packet tree is selected by an optimality criterion that minimizes a cost function based on the critical band structure. The models of the individual auditory phenomena are reformulated for the multiresolution framework; this includes duration-dependent outer and middle ear weighting, multiresolution spectral spreading, and multiresolution temporal smearing. As an application, the excitation pattern is used to define an objective measure of the auditory distortion of a degraded speech signal relative to the undistorted one. The measure's ability to predict the subjective mean opinion score (MOS) is evaluated on a database of wideband speech signals degraded with various kinds of NOISEX-92 noise, and is compared with the fast Fourier transform (FFT)-based ITU-T PESQ P.862.2 algorithm. The proposed measure achieves a correlation between subjective and objective MOS comparable to that of PESQ P.862.2, with a trend toward better correlation for nonstationary degradations than for stationary ones. Further refinement of the measure for distortion types other than additive noise is anticipated.
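To make the time-frequency decomposition step concrete, the following is a minimal sketch of a wavelet packet decomposition, not the paper's implementation. It assumes a Haar filter pair and a full (unpruned) tree of fixed depth; the abstract does not name the wavelet, and the paper's tree is instead pruned by a critical-band cost function that is not given here. The names `haar_step`, `wavelet_packet`, and `band_energies` are hypothetical.

```python
import math


def haar_step(x):
    """Split an even-length sequence into Haar low-pass and high-pass halves."""
    s = 1.0 / math.sqrt(2.0)
    low = [(x[i] + x[i + 1]) * s for i in range(0, len(x), 2)]
    high = [(x[i] - x[i + 1]) * s for i in range(0, len(x), 2)]
    return low, high


def wavelet_packet(x, depth):
    """Return the leaf subbands of a full wavelet packet tree.

    Assumption: every node is split at every level. A model like the one
    described above would keep only the splits chosen by its cost function.
    Leaves come out in natural (filter) order, not strict frequency order.
    Requires len(x) to be divisible by 2**depth.
    """
    nodes = [list(x)]
    for _ in range(depth):
        next_nodes = []
        for node in nodes:
            low, high = haar_step(node)
            next_nodes.extend([low, high])
        nodes = next_nodes
    return nodes


def band_energies(leaves):
    """Energy (sum of squared coefficients) in each leaf subband."""
    return [sum(c * c for c in leaf) for leaf in leaves]


# Toy input: a 32-sample sinusoid, used only to exercise the functions.
signal = [math.sin(2 * math.pi * 3 * n / 32) for n in range(32)]
leaves = wavelet_packet(signal, 3)
energies = band_energies(leaves)
```

Because the Haar pair is orthonormal, the subband energies sum to the energy of the input signal, which gives a quick sanity check on any such decomposition. A per-band energy vector of this kind is only a starting point; the ear weighting, spectral spreading, and temporal smearing stages described in the abstract would operate on top of it.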