Non-intrusive speech quality assessment using multi-resolution auditory model features for degraded narrowband speech


Abstract

The multi-resolution auditory model (MRAM) is built on a multi-resolution framework that uses an auditory perception-based wavelet packet transform, and it is applied here to non-intrusive objective speech quality estimation. The MRAM models the human auditory system in finer time-frequency detail than earlier models used for non-intrusive speech quality estimation. The objective Mean Opinion Score (MOS) of a degraded narrowband speech utterance is estimated with a Gaussian Mixture Model (GMM) probabilistic approach applied to an MRAM-based feature vector. For comparison with the MRAM features, features based on a recent auditory model (Lyon's auditory model), mel-frequency cepstral coefficients (MFCC), and line spectral frequencies (LSF) are each used independently as well. A combination of MFCC and LSF features with MRAM features, again using the GMM probabilistic approach, is also proposed and investigated. The performance of these feature vectors is evaluated against ITU-T Recommendation P.563 and a recently published method by computing the correlation coefficient and the root-mean-square error between the subjective MOS and the estimated objective MOS. The proposed method, which combines MRAM, MFCC, and LSF feature vectors for non-intrusive speech quality estimation, performs better than both of those algorithms.
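The two evaluation measures named in the abstract, the correlation coefficient and the root-mean-square error between subjective and estimated MOS, can be illustrated with a minimal sketch. It assumes the Pearson form of the correlation coefficient and per-utterance scoring; the abstract does not say which variant the authors used. The function names `pearson_correlation` and `rmse` are hypothetical, and the scores are made up for illustration, not taken from the paper.

```python
import math


def pearson_correlation(subjective, objective):
    """Pearson correlation coefficient between two equal-length score lists."""
    if len(subjective) != len(objective) or len(subjective) < 2:
        raise ValueError("need two equal-length sequences with at least 2 items")
    n = len(subjective)
    mean_s = sum(subjective) / n
    mean_o = sum(objective) / n
    cov = sum((s - mean_s) * (o - mean_o) for s, o in zip(subjective, objective))
    var_s = sum((s - mean_s) ** 2 for s in subjective)
    var_o = sum((o - mean_o) ** 2 for o in objective)
    if var_s == 0 or var_o == 0:
        raise ValueError("correlation is undefined for constant scores")
    return cov / math.sqrt(var_s * var_o)


def rmse(subjective, objective):
    """Root-mean-square error between two equal-length score lists."""
    if len(subjective) != len(objective) or not subjective:
        raise ValueError("need two equal-length, non-empty sequences")
    squared = sum((s - o) ** 2 for s, o in zip(subjective, objective))
    return math.sqrt(squared / len(subjective))


# Made-up scores for illustration only; not data from the paper.
subjective_mos = [1.0, 2.0, 3.0, 4.0]
estimated_mos = [1.5, 2.5, 3.5, 4.5]

r = pearson_correlation(subjective_mos, estimated_mos)
e = rmse(subjective_mos, estimated_mos)
print(f"correlation = {r:.3f}, RMSE = {e:.3f}")
```

In this toy case the estimate is offset from the subjective score by a constant 0.5, so the correlation is perfect while the RMSE is not zero. That gap is one reason for reporting both measures: correlation captures how well the ranking of utterances is preserved, and RMSE captures the absolute error.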

Bibliographic details

  • Source
    Signal Processing, IET | 2015, Issue 9 | pp. 638-646 | 9 pages
  • Authors

    Dubey Rajesh Kumar; Kumar Arun;

  • Affiliation

    Center for Appl. Res. in Electron., Indian Inst. of Technol.-Delhi, New Delhi, India;

  • Original format: PDF
  • Language: English
