Journal: IEICE Transactions on Information and Systems

ATR Parallel Decoding Based Speech Recognition System Robust to Noise and Speaking Styles


Abstract

In this paper, we describe a parallel decoding-based ASR system developed at ATR that is robust to noise type, SNR, and speaking style. Speech affected by such varied factors is difficult to recognize, especially when an ASR system contains only a single acoustic model. One solution is to employ multiple acoustic models, one for each condition. Even though the robustness of each individual acoustic model is limited, the ASR system as a whole can handle a variety of conditions appropriately. Our system has two recognition sub-systems that use different features, such as MFCC and Differential MFCC (DMFCC). Each sub-system has several acoustic models that depend on SNR, speaker gender, and speaking style, and during recognition each acoustic model is adapted by fast noise adaptation. From each sub-system, one hypothesis is selected based on posterior probability. The final recognition result is obtained by combining the best hypotheses from the two sub-systems. On the AURORA-2J task, which is widely used to evaluate noise robustness, our system achieved higher recognition performance than a system containing only a single model. Our system was also tested on normal and hyper-articulated speech contaminated by several background noises, and it exhibited high robustness to both noise and speaking style.
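The selection flow described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the `Hypothesis` type, the function names, the model labels, and the toy posterior values are all hypothetical. In particular, the abstract does not say how the two best hypotheses are combined, so the sketch assumes the simplest possible rule (keep whichever has the higher posterior) purely as a placeholder.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Hypothesis:
    """One decoding result from one acoustic model (hypothetical structure)."""
    words: Tuple[str, ...]  # recognized word sequence
    posterior: float        # utterance-level posterior probability
    model: str              # label of the acoustic model that produced it


def select_best(hypotheses: List[Hypothesis]) -> Hypothesis:
    """Within one sub-system, keep the hypothesis with the highest posterior."""
    if not hypotheses:
        raise ValueError("a sub-system must produce at least one hypothesis")
    return max(hypotheses, key=lambda h: h.posterior)


def combine(best_a: Hypothesis, best_b: Hypothesis) -> Hypothesis:
    """Combine the two sub-system winners.

    ASSUMPTION: the abstract does not specify the combination method.
    This placeholder simply returns the winner with the higher posterior.
    """
    return best_a if best_a.posterior >= best_b.posterior else best_b


def recognize(mfcc_hyps: List[Hypothesis],
              dmfcc_hyps: List[Hypothesis]) -> Hypothesis:
    """Select one hypothesis per sub-system, then combine the two winners."""
    return combine(select_best(mfcc_hyps), select_best(dmfcc_hyps))


if __name__ == "__main__":
    # Toy values for illustration only; these are not results from the paper.
    mfcc = [
        Hypothesis(("one", "two"), 0.62, "mfcc/clean/male"),
        Hypothesis(("one", "too"), 0.48, "mfcc/snr10/female"),
    ]
    dmfcc = [
        Hypothesis(("one", "two"), 0.71, "dmfcc/snr10/male"),
        Hypothesis(("nine", "two"), 0.55, "dmfcc/clean/female"),
    ]
    result = recognize(mfcc, dmfcc)
    print(result.model, " ".join(result.words))
```

In a real system each list would hold one hypothesis per acoustic model (per SNR, gender, and speaking-style condition), decoded in parallel after noise adaptation; the sketch covers only the selection and combination steps.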
