Conference: International Workshop on Acoustic Signal Enhancement

A Speech Enhancement System for Automotive Speech Recognition with a Hybrid Voice Activity Detection Method

Abstract

This paper presents a front-end speech enhancement approach to robust speech recognition in automotive environments. It combines hybrid voice activity detection (VAD), relative transfer function (RTF) based generalized sidelobe cancellation, and single-channel post-filtering to enhance the speech signal of interest, thereby improving the robustness of speech recognition. First, we choose four typical driving scenarios, which together cover most of the noise types found in automobiles, to record training data. The recorded data are then used to train deep neural network (DNN) models for both speech and noise. The trained DNNs are subsequently used to estimate the speech presence probability on a frame-by-frame basis. This speech presence probability is then combined with the output of an energy-based VAD to form a hybrid VAD, which serves as the basis for the remaining components of the speech enhancement system, including RTF estimation, adaptive beamforming, and post-filtering. Experiments are conducted in real automotive environments. The results show that the developed method can significantly improve the performance of both VAD and automatic speech recognition (ASR).
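
The hybrid VAD idea described in the abstract can be illustrated with a minimal sketch. The abstract does not state how the DNN speech presence probability and the energy-based VAD output are fused, so the rule below (a frame counts as speech only when both detectors agree) is an assumption made for illustration. The function names, the 6 dB margin, the 0.5 probability threshold, and the use of the minimum frame energy as the noise floor are likewise hypothetical, and the per-frame probabilities are supplied by hand in place of a trained DNN.

```python
import math

def frame_energies_db(samples, frame_len):
    """Log energy (dB) of consecutive non-overlapping frames."""
    energies = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        power = sum(x * x for x in frame) / frame_len
        energies.append(10.0 * math.log10(power + 1e-12))
    return energies

def energy_vad(energies_db, margin_db=6.0):
    """Flag frames whose energy exceeds the noise floor by margin_db.

    Assumption: the noise floor is the minimum frame energy observed.
    """
    floor = min(energies_db)
    return [e > floor + margin_db for e in energies_db]

def hybrid_vad(speech_prob, energy_flags, prob_threshold=0.5):
    """Assumed fusion rule: speech only if both detectors agree."""
    return [p > prob_threshold and flag
            for p, flag in zip(speech_prob, energy_flags)]

# Synthetic example: four 160-sample frames (quiet, loud, loud, quiet).
tone = [math.sin(2.0 * math.pi * k / 16.0) for k in range(160)]
quiet = [0.01 * t for t in tone]
loud = [0.5 * t for t in tone]
samples = quiet + loud + loud + quiet

# Hand-written stand-in for the DNN output: the third frame is loud but
# judged unlikely to be speech (for example, a burst of road noise).
speech_prob = [0.1, 0.9, 0.2, 0.1]

energy_flags = energy_vad(frame_energies_db(samples, 160))
hybrid_flags = hybrid_vad(speech_prob, energy_flags)
print(hybrid_flags)
```

In this toy case the energy detector alone accepts both loud frames, while the combined decision keeps only the second one. That is the kind of false alarm a hybrid scheme is meant to suppress before the decision is passed on to RTF estimation, beamforming, and post-filtering.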