Journal: EURASIP Journal on Audio, Speech, and Music Processing

Recognition of Noisy Speech: A Comparative Survey of Robust Model Architecture and Feature Enhancement



Abstract

The performance of speech recognition systems degrades strongly in the presence of background noise, such as the driving noise inside a car. In contrast to existing work, we aim to improve noise robustness by addressing all major levels of speech recognition: feature extraction, feature enhancement, speech modelling, and training. We give an overview of promising auditory modelling concepts, speech enhancement techniques, training strategies, and model architectures, all of which are implemented in an in-car digit and spelling recognition task that considers noise produced by various car types and driving conditions. We show that joint speech and noise modelling with a Switching Linear Dynamic Model (SLDM) outperforms speech enhancement techniques such as Histogram Equalisation (HEQ), with a mean relative error reduction of 52.7% over various noise types and levels. For speech disturbed by additive white Gaussian noise, embedding a Switching Linear Dynamical System (SLDS) into a Switching Autoregressive Hidden Markov Model (SAR-HMM) prevails.
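As a rough illustration of the Histogram Equalisation (HEQ) baseline named in the abstract, the sketch below maps each feature dimension of a noisy utterance onto the empirical distribution of clean reference features by matching cumulative distribution functions. This is a generic textbook form of HEQ and not necessarily the variant evaluated in the paper; the function name `histogram_equalise`, its parameters, and the synthetic data are assumptions made for illustration only.

```python
import numpy as np


def histogram_equalise(noisy, reference):
    """Quantile-map each feature dimension of `noisy` onto `reference`.

    noisy:     array of shape (T, D), feature frames of one noisy utterance
    reference: array of shape (N, D), clean features defining the target
               distribution (hypothetical training data)
    """
    noisy = np.asarray(noisy, dtype=float)
    reference = np.asarray(reference, dtype=float)
    equalised = np.empty_like(noisy)
    num_frames = noisy.shape[0]

    for d in range(noisy.shape[1]):
        # Rank of each frame within the utterance (0 = smallest value).
        ranks = np.argsort(np.argsort(noisy[:, d]))
        # Empirical CDF value of each frame, kept strictly inside (0, 1).
        cdf = (ranks + 0.5) / num_frames
        # Inverse reference CDF: look up the matching reference quantile.
        equalised[:, d] = np.quantile(reference[:, d], cdf)

    return equalised


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    clean = rng.normal(size=(500, 2))   # stand-in for clean features
    distorted = 2.0 * clean + 3.0       # synthetic scale-and-shift distortion
    restored = histogram_equalise(distorted, clean)
    print(restored.mean(axis=0), clean.mean(axis=0))
```

Because the mapping relies only on ranks, it undoes any monotonic distortion of a single feature dimension, but it treats each frame and dimension in isolation. A joint speech and noise model such as the SLDM described above instead models how the features evolve over time.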
