Non-intrusive speech quality assessment using multi-resolution auditory model features for degraded narrowband speech

Dubey Rajesh Kumar; Kumar Arun


Abstract

A multi-resolution framework based on an auditory perception-based wavelet packet transform is used in a multi-resolution auditory model (MRAM) for non-intrusive objective speech quality estimation. The MRAM provides more detailed time-frequency modelling of the human auditory system than earlier models used for non-intrusive speech quality estimation. The objective Mean Opinion Score (MOS) of a degraded narrowband speech utterance is estimated with a Gaussian Mixture Model (GMM) probabilistic approach using an MRAM-based feature vector. For comparison with the MRAM features, features based on a recent auditory model (Lyon's auditory model), mel-frequency cepstral coefficients (MFCC), and line spectral frequencies (LSF) are each also used independently. The combination of MFCC and LSF features with MRAM features for non-intrusive speech quality estimation using the GMM probabilistic approach is proposed and investigated. The performance of these feature vectors is evaluated and compared with ITU-T Recommendation P.563 and a recently published work by computing the correlation coefficient and root-mean-square error between the subjective MOS and the estimated objective MOS. The proposed method, which combines MRAM, MFCC, and LSF feature vectors for non-intrusive speech quality estimation, is found to perform better than both of the other algorithms.
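The two evaluation measures named in the abstract, the correlation coefficient and the root-mean-square error between subjective and objective MOS, can be sketched as follows. This is a minimal illustration and not the authors' code: it assumes the Pearson form of the correlation coefficient (the abstract does not say which form was used), the function names are hypothetical, and the scores are made-up values chosen only to make the arithmetic easy to follow.

```python
import math


def pearson_correlation(subjective, objective):
    """Pearson correlation coefficient between two equal-length score lists."""
    n = len(subjective)
    if n != len(objective) or n < 2:
        raise ValueError("need two equal-length lists with at least two scores")
    mean_s = sum(subjective) / n
    mean_o = sum(objective) / n
    # Sum of products of deviations from the means (unnormalised covariance).
    cov = sum((s - mean_s) * (o - mean_o) for s, o in zip(subjective, objective))
    var_s = sum((s - mean_s) ** 2 for s in subjective)
    var_o = sum((o - mean_o) ** 2 for o in objective)
    if var_s == 0 or var_o == 0:
        raise ValueError("correlation is undefined for constant scores")
    return cov / math.sqrt(var_s * var_o)


def rmse(subjective, objective):
    """Root-mean-square error between two equal-length score lists."""
    n = len(subjective)
    if n != len(objective) or n == 0:
        raise ValueError("need two equal-length, non-empty lists")
    squared_errors = ((s - o) ** 2 for s, o in zip(subjective, objective))
    return math.sqrt(sum(squared_errors) / n)


# Hypothetical scores for four utterances (not data from the paper).
subjective_mos = [1.0, 2.0, 3.0, 4.0]
# The estimator here is assumed to overshoot every utterance by 0.5 MOS.
objective_mos = [1.5, 2.5, 3.5, 4.5]

r = pearson_correlation(subjective_mos, objective_mos)
e = rmse(subjective_mos, objective_mos)
print(f"correlation = {r:.3f}, RMSE = {e:.3f}")
```

In this toy case the correlation is 1.0 while the RMSE is 0.5: a constant offset leaves the correlation untouched but still shows up as error, which is one reason to report both measures together.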


Bibliographic details

Source
Signal Processing, IET | 2015, Issue 9 | pp. 638-646 | 9 pages
Authors
Dubey Rajesh Kumar; Kumar Arun
Author affiliation

Center for Appl. Res. in Electron., Indian Inst. of Technol.-Delhi, New Delhi, India

Original format: PDF
Language: English

