Single-Ended Speech Quality Prediction Based on Automatic Speech Recognition

RAINER HUBER; JASPER OOSTER; BERND T. MEYER

Abstract

A new single-ended speech quality measure is proposed that uses a deep neural network (DNN)-based automatic speech recognition system. A quality measure is used to quantify the degradation of the DNN output (phoneme posterior probabilities or posteriorgrams) caused by speech distortions. The new method was evaluated using five databases containing nine subsets of data covering several conditions of narrowband and broadband speech, distorted by speech codecs, telecommunication networks, clipping, chopped speech, echoes, competing speakers, and additional background noises. Since our model is trained as a speaker-independent speech-specific system, it is not suited for predicting speech quality in the presence of a background speaker. The evaluation results of all remaining eight data subsets show that good average correlations with subjective speech quality ratings are achieved without any task-specific training or optimizations (r = 0.81). These average results are close to those achieved with the American National Standard ANIQUE+ (r = 0.83) and clearly better than those obtained with the ITU-T standard P.563 (r = 0.58).
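The idea in the abstract, scoring how much a posteriorgram degrades and then correlating that score with subjective ratings, can be illustrated with a minimal sketch. The abstract does not say which degradation measure is used, so the mean frame entropy below is a stand-in chosen for illustration only, not the authors' measure. The function names and the toy posteriorgrams are hypothetical.

```python
import numpy as np

def mean_frame_entropy(posteriorgram):
    """Average per-frame entropy (nats) of a (frames x phonemes) posteriorgram.

    Assumption: entropy is a stand-in degradation score, not the paper's measure.
    Sharp, confident phoneme posteriors give low entropy; smeared ones give high.
    """
    p = np.asarray(posteriorgram, dtype=float)
    p = p / p.sum(axis=1, keepdims=True)   # renormalize each frame to sum to 1
    p = np.clip(p, 1e-12, 1.0)             # avoid log(0)
    return float(-(p * np.log(p)).sum(axis=1).mean())

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    return float(np.corrcoef(x, y)[0, 1])

# Hypothetical toy posteriorgrams: 2 frames, 4 phoneme classes.
sharp = np.array([[0.97, 0.01, 0.01, 0.01],
                  [0.01, 0.97, 0.01, 0.01]])   # clean-speech-like output
smeared = np.full((2, 4), 0.25)                # heavily degraded output

print(mean_frame_entropy(sharp) < mean_frame_entropy(smeared))
```

In an evaluation like the one described, one such score per recording would be correlated with the subjective ratings of each data subset using `pearson_r`, and the per-subset correlations averaged. With an entropy-style score the raw correlation with quality ratings would be expected to be negative, so its magnitude is the figure of interest.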

Bibliographic record

Source
Journal of the Audio Engineering Society, 2018, No. 10, pp. 759-769 (11 pages)
Authors
RAINER HUBER; JASPER OOSTER; BERND T. MEYER
Author affiliations
Medizinische Physik and Cluster of Excellence Hearing4all, University of Oldenburg, Oldenburg, Germany (listed once for each of the three authors)
Original format
PDF
Text language
English

Similar literature

1. Speech confusion index (Φ): A confusion-based speech quality indicator and recognition rate prediction for dysarthria [J]. Prakasith Kayasith, Thanaruk Theeramunkong. Computers & Mathematics with Applications, 2009, No. 8.
2. Speech Clarity Index (Ψ): A Distance-based Speech Quality Indicator And Recognition Rate Prediction For Dysarthric Speakers With Cerebral Palsy [J]. Prakasith Kayasith, Thanaruk Theeramunkong. IEICE Transactions on Information and Systems, 2009, No. 3.
3. Application of automatic speech recognition to quantitative assessment of tracheoesophageal speech with different signal quality [J]. Haderlein T, Riedhammer K, Noth E. Folia Phoniatrica et Logopaedica: official organ of the International Association of Logopedics and Phoniatrics (IALP), 2009, No. 1.
4. Automatic Speech Recognition for Assistive Writing in Speech Supplemented Word Prediction [C]. John-Paul Hosom, Tom Jakobs, Allen Baker. Annual Conference of the International Speech Communication Association; INTERSPEECH 2010. 2011.
5. HMM-based non-intrusive speech quality and implementation of Viterbi score distribution and hiddenness based measures to improve the performance of speech recognition [D]. Talwar, Gaurav. 2006.
6. A Speech Recognition-based Solution for the Automatic Detection of Mild Cognitive Impairment from Spontaneous Speech [O]. László Tóth, Ildikó Hoffmann, Gábor Gosztolya.
7. Speech confusion index (Φ): A confusion-based speech quality indicator and recognition rate prediction for dysarthria [O]. Kayasith Prakasith, Theeramunkong Thanaruk. 2009.
