Source: 電子情報通信学会技術研究報告 (IEICE Technical Report)

Robust Distant Speech Recognition by Combining Variable-term Spectrum Based Position-dependent CMN with Conventional CMN


Abstract

In a distant-talking environment, the channel impulse response is longer than the short-term spectral analysis window. The conventional short-term spectrum based Cepstral Mean Normalization (CMN) is therefore not effective under these conditions. In this paper, we propose a robust distant speech recognition method that combines a short-term spectrum based CMN with a long-term one. We assume that a static speech segment (such as a vowel) affected by reverberation can be modeled by a long-term cepstral analysis, so that the effect of long reverberation on such a segment may be compensated by the long-term spectrum based CMN. We extend this concept of combining short-term and long-term spectrum based CMN to an environmentally robust speech recognition method based on Position-Dependent CMN (PDCMN). We call the resulting method Variable-Term spectrum based PDCMN (VT-PDCMN). Since PDCMN/VT-PDCMN cannot normalize speaker variations, we also combine PDCMN/VT-PDCMN with conventional CMN in this study. We evaluated the proposed method on a limited-vocabulary (100 words) distant-talking isolated word recognition task in a real environment. The proposed method achieved a relative error reduction rate of 60.9% over the conventional short-term spectrum based CMN and 30.6% over the short-term spectrum based PDCMN.
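To make the terms in the abstract concrete, the following is a minimal sketch of cepstral mean subtraction and of the relative error reduction metric. It is not the authors' implementation. Conventional CMN estimates the mean from the utterance itself. The position-dependent variant shown here assumes that a cepstral mean has been measured in advance for each speaker position, which the abstract does not spell out. All function names and numeric values are hypothetical, and the variable-term (short-term plus long-term) analysis and the way the paper combines PDCMN with CMN are not modeled.

```python
from typing import List, Sequence

Frames = List[List[float]]  # one list of cepstral coefficients per frame


def cepstral_mean(frames: Sequence[Sequence[float]]) -> List[float]:
    """Average each cepstral coefficient over all frames."""
    if not frames:
        raise ValueError("at least one frame is required")
    n = len(frames)
    dims = len(frames[0])
    return [sum(frame[d] for frame in frames) / n for d in range(dims)]


def subtract_mean(frames: Sequence[Sequence[float]],
                  mean: Sequence[float]) -> Frames:
    """Subtract a mean vector from every frame."""
    return [[c - m for c, m in zip(frame, mean)] for frame in frames]


def conventional_cmn(frames: Sequence[Sequence[float]]) -> Frames:
    """Conventional CMN: the mean is estimated from the utterance itself."""
    return subtract_mean(frames, cepstral_mean(frames))


def position_dependent_cmn(frames: Sequence[Sequence[float]],
                           position_mean: Sequence[float]) -> Frames:
    """Assumed PDCMN form: subtract a mean measured beforehand for the
    speaker's position instead of one estimated from the utterance."""
    return subtract_mean(frames, position_mean)


def relative_error_reduction(baseline_error: float,
                             proposed_error: float) -> float:
    """Fraction of the baseline error rate removed by the proposed method."""
    return (baseline_error - proposed_error) / baseline_error


if __name__ == "__main__":
    # Hypothetical two-frame utterance with two cepstral coefficients.
    clean = [[1.0, 2.0], [3.0, 6.0]]
    # A fixed channel appears as a constant additive offset in the cepstrum.
    channel_offset = [0.5, 0.5]
    observed = [[c + o for c, o in zip(f, channel_offset)] for f in clean]

    # Conventional CMN removes the constant offset (and the speech mean).
    print(conventional_cmn(observed))
    # With a known per-position mean, only the channel offset is removed.
    print(position_dependent_cmn(observed, channel_offset))
    # Illustrative error rates only; these are not figures from the paper.
    print(relative_error_reduction(10.0, 4.0))
```

Because a time-invariant channel adds a constant vector in the cepstral domain, subtracting a mean cancels it. This only holds when the channel impulse response fits inside the analysis window, which is the limitation the abstract attributes to short-term CMN under long reverberation.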
