Annual Conference of the International Speech Communication Association (INTERSPEECH 2010)

Robust Automatic Speech Recognition with Decoder Oriented Ideal Binary Mask Estimation

Abstract

In this paper, we propose a jointly optimal method for automatic speech recognition (ASR) and ideal binary mask (IBM) estimation in the cepstral domain, based on a newly derived generalized expectation maximization algorithm. First, cepstral-domain missing-feature marginalization is established using a linear transformation, after tying the mean and variance of non-existing cepstral coefficients. Second, IBM estimation is formulated with a generalized expectation maximization algorithm so that it directly optimizes ASR performance. Experimental results show that, even in a highly non-stationary mismatch condition (dance music as background noise), the proposed method achieves absolute ASR accuracy improvements over a conventional noise suppression method ranging from 14.69% at 0 dB SNR to 40.10% at 15 dB SNR.
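
As background for the two steps named in the abstract, the following minimal sketch illustrates the textbook notions of an ideal binary mask and of missing-feature marginalization in the spectral domain. It is not the paper's cepstral-domain formulation or its generalized EM estimator. The function names, the 0 dB local criterion, and the diagonal-Gaussian acoustic model are assumptions made only for illustration.

```python
import math


def ideal_binary_mask(speech_power, noise_power, lc_db=0.0):
    """Mark a time-frequency cell reliable (1) when its local SNR exceeds lc_db.

    speech_power and noise_power are lists of frames, each a list of
    per-channel powers. Computing an *ideal* mask assumes clean speech and
    noise are known separately, which is only true in oracle experiments.
    """
    # Compare in the linear domain to avoid taking the log of zero.
    threshold = 10.0 ** (lc_db / 10.0)
    return [
        [1 if s > n * threshold else 0 for s, n in zip(s_row, n_row)]
        for s_row, n_row in zip(speech_power, noise_power)
    ]


def marginal_log_likelihood(frame, mask_row, mean, var):
    """Diagonal-Gaussian log-likelihood over reliable channels only.

    Channels with mask 0 are integrated out, which for a diagonal Gaussian
    simply means dropping them from the sum.
    """
    total = 0.0
    for x, m, mu, v in zip(frame, mask_row, mean, var):
        if m == 1:
            total += -0.5 * (math.log(2.0 * math.pi * v) + (x - mu) ** 2 / v)
    return total


# Hypothetical toy data: two frames, two frequency channels.
speech = [[4.0, 1.0], [0.5, 9.0]]
noise = [[1.0, 2.0], [1.0, 1.0]]
mask = ideal_binary_mask(speech, noise)
score = marginal_log_likelihood([0.0, 5.0], mask[0], [0.0, 0.0], [1.0, 1.0])
```

Dropping unreliable channels is straightforward here because each spectral channel is modeled independently. A cepstral coefficient mixes all spectral channels through a linear transform, so the same masking cannot be applied coefficient by coefficient; this is the difficulty the abstract's first step addresses.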
