IEEE International Conference on Acoustics, Speech and Signal Processing
Robustness to speaker position in distant-talking automatic speech recognition

Abstract

In this paper, we present a method that significantly improves on our previous work in single-channel dereverberation and is more robust to changes in speaker position in distant-talking ASR. First, we update the room transfer function (RTF) and the weighting parameters for dereverberation to match the target speaker position. This scheme corrects speech power variation as a function of position at the waveform level, and we verify its impact on the acoustic model. Then, we implement a fast acoustic model update that reflects the speech power level at the target speaker position. The update scheme is simple and avoids time-consuming model re-estimation, so the proposed method can be executed online. Together, these corrective measures significantly reduce the mismatch between training and testing conditions. We test our method on real reverberant data from different locations inside the room. Experimental results show that the proposed method outperforms conventional methods in ASR performance. Moreover, our fast acoustic model update scheme is on par with time-consuming model re-estimation in recognition performance.
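The abstract does not state the update rule, so the following is only an illustrative sketch of why a power-level correction can avoid re-estimation, and not the authors' scheme. It assumes, hypothetically, that the acoustic model is a set of Gaussians over log-power features. Under that assumption, a constant gain in speech power translates every log-power feature by the same offset, so the means can be shifted and the variances left alone. The names `power_offset_log` and `shift_log_power_means`, and all numeric values, are invented for illustration.

```python
import math


def power_offset_log(ref_power: float, target_power: float) -> float:
    """Natural-log offset between target-position and reference speech power."""
    if ref_power <= 0 or target_power <= 0:
        raise ValueError("powers must be positive")
    return math.log(target_power / ref_power)


def shift_log_power_means(means, offset):
    """Return new Gaussian means with `offset` added to every dimension.

    Assumes every dimension is a log-power feature. Variances are not
    touched, because a constant gain only translates log-power features.
    """
    return [[m + offset for m in mean] for mean in means]


# Hypothetical model: two Gaussians over three log-power dimensions.
means = [[1.0, 2.0, 3.0], [0.5, 0.0, -0.5]]

# Hypothetical case: speech power at the target position is a quarter
# of the power at the position used for training.
offset = power_offset_log(ref_power=1.0, target_power=0.25)
updated = shift_log_power_means(means, offset)
```

The shift costs one addition per mean component, which is what makes this kind of update cheap enough to run online. Cepstral features or position-dependent reverberation would need a more careful mapping than this constant offset.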