Journal: IEEE Transactions on Audio, Speech and Language Processing

Mask estimation for missing data speech recognition based on statistics of binaural interaction



Abstract

This paper describes a perceptually motivated computational auditory scene analysis (CASA) system that combines sound separation according to spatial location with the "missing data" approach for robust speech recognition in noise. Missing data time-frequency masks are created using probability distributions based on estimates of interaural time and level differences (ITD and ILD) for mixed utterances in reverberated conditions; these masks indicate which regions of the spectrum constitute reliable evidence of the target speech signal. A number of experiments compare the relative efficacy of the binaural cues when used individually and in combination. We also investigate the ability of the system to generalize to acoustic conditions not encountered during training. Performance on a continuous digit recognition task using this method is found to be good, even in a particularly challenging environment with three concurrent male talkers.
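
The abstract does not give the details of the mask estimation, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes that ITD and ILD estimates are already available for every time-frequency cell, and that a per-channel lookup table of the probability that the target dominates a cell has been learned beforehand. The function name `estimate_mask`, the table `p_target`, the bin edges, and the 0.5 threshold are all hypothetical choices made for illustration.

```python
import numpy as np

def estimate_mask(itd, ild, p_target, itd_edges, ild_edges, threshold=0.5):
    """Return a binary missing-data mask of shape (channels, frames).

    itd, ild   : arrays of shape (channels, frames) with cue estimates.
    p_target   : hypothetical table of shape (channels, n_itd_bins, n_ild_bins)
                 holding P(target dominant | ITD bin, ILD bin) per channel.
    itd_edges,
    ild_edges  : increasing bin edges used to quantize the cues.
    threshold  : assumed decision threshold on the probability.
    """
    n_channels, n_frames = itd.shape
    mask = np.zeros((n_channels, n_frames), dtype=bool)
    for c in range(n_channels):
        # Quantize each cue to a bin index, clamping out-of-range values
        # to the first or last bin.
        i = np.clip(np.digitize(itd[c], itd_edges) - 1, 0, len(itd_edges) - 2)
        j = np.clip(np.digitize(ild[c], ild_edges) - 1, 0, len(ild_edges) - 2)
        # A cell is marked reliable when the looked-up probability
        # exceeds the threshold.
        mask[c] = p_target[c, i, j] > threshold
    return mask
```

A recognizer using the missing data approach would then treat cells where the mask is `True` as reliable evidence of the target speech and the remaining cells as unreliable. Using only one cue, as in the individual-cue experiments the abstract mentions, would correspond to a table indexed by ITD or ILD alone.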
