Recognition of Noisy Speech: A Comparative Survey of Robust Model Architecture and Feature Enhancement

Björn Schuller; Martin Wöllmer; Tobias Moosmayr; Gerhard Rigoll


Abstract

The performance of speech recognition systems degrades strongly in the presence of background noise, such as the driving noise inside a car. In contrast to existing work, we aim to improve noise robustness by addressing all major levels of speech recognition: feature extraction, feature enhancement, speech modelling, and training. We give an overview of promising auditory modelling concepts, speech enhancement techniques, training strategies, and model architectures, which are implemented in an in-car digit and spelling recognition task that considers noise produced by various car types and driving conditions. We show that joint speech and noise modelling with a Switching Linear Dynamic Model (SLDM) outperforms speech enhancement techniques like Histogram Equalisation (HEQ), with a mean relative error reduction of 52.7% over various noise types and levels. Embedding a Switching Linear Dynamical System (SLDS) into a Switching Autoregressive Hidden Markov Model (SAR-HMM) performs best for speech disturbed by additive white Gaussian noise.
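
The 52.7% figure quoted in the abstract is a mean relative error reduction. The abstract does not spell out how it was computed, so the following is only a minimal sketch of one common definition: the relative drop in error rate per test condition, averaged over conditions. The function names and the error rates are hypothetical and are not taken from the paper.

```python
def relative_error_reduction(baseline_error, improved_error):
    """Fraction of the baseline error removed by the improved system."""
    if baseline_error <= 0:
        raise ValueError("baseline error must be positive")
    return (baseline_error - improved_error) / baseline_error


def mean_relative_error_reduction(pairs):
    """Average the per-condition reductions over (baseline, improved) pairs."""
    reductions = [relative_error_reduction(b, e) for b, e in pairs]
    return sum(reductions) / len(reductions)


# Hypothetical word error rates (in percent) for two noise conditions.
# These numbers are made up for illustration only.
conditions = [(20.0, 10.0), (40.0, 10.0)]
print(f"{mean_relative_error_reduction(conditions):.1%}")  # 62.5% for these made-up numbers
```

Note that averaging per-condition reductions, as assumed here, generally gives a different number than computing one reduction from the pooled error rates.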


Bibliographic details

Source: EURASIP Journal on Audio, Speech, and Music Processing, 2009, issue 1, 17 pages
Authors: Björn Schuller; Martin Wöllmer; Tobias Moosmayr; Gerhard Rigoll
Original format: PDF
Chinese Library Classification: Radio electronics; telecommunications technology

Similar literature

1. Recognition of Noisy Speech: A Comparative Survey of Robust Model Architecture and Feature Enhancement [J]. Bjoern Schuller, Martin Woellmer, Tobias Moosmayr. EURASIP Journal on Audio, Speech, and Music Processing. 2009, issue 009
2. Robust Speech Recognition System Using Conventional and Hybrid Features of MFCC, LPCC, PLP, RASTA-PLP and Hidden Markov Model Classifier in Noisy Conditions [J]. Veton Z. Këpuska, Hussien A. Elharati. Journal of Computer and Communications. 2015, issue 6
3. A Statistical Analysis on the Impact of Speech Enhancement Techniques on the Feature Vectors of Noisy Speech Signals for Speech Recognition [J]. Swapnanil Gogoi, Utpal Bhattacharjee. Journal of Computer Science Engineering and Information Technology Research. 2016, issue 3
4. A Comparative Analysis of Noise Robust Speech Features Extracted from All-pass based Warping with MFCC in a Noisy Phoneme Recognition [C]. R. Muralishankar, Douglas O'Shaughnessy. International Conference on Digital Telecommunications. 2008
5. Evaluation of speech enhancement techniques for speaker recognition in noisy environments [D]. El-Solh, Abdel-Aziz. 2006
6. Cascaded Convolutional Neural Network Architecture for Speech Emotion Recognition in Noisy Conditions [O]. Youngja Nam, Chankyu Lee. 2021
7. Recognition of Noisy Speech: A Comparative Survey of Robust Model Architecture and Feature Enhancement [O]. 2009
