Conference: International Workshop on Acoustic Signal Enhancement

A Speech Enhancement System for Automotive Speech Recognition with a Hybrid Voice Activity Detection Method

Abstract

This paper presents a front-end speech enhancement approach to robust speech recognition in automotive environments. It combines hybrid voice activity detection (VAD), relative transfer function (RTF) based generalized sidelobe cancellation, and single-channel post-filtering to enhance the speech signal of interest, thereby improving the robustness of speech recognition. First, we choose four typical driving scenarios, which cover most of the noise types found in automobiles, and record training data in them. The recorded data is then used to train deep neural network (DNN) models for both speech and noise. The trained DNNs are subsequently used to estimate the speech presence probability on a frame-by-frame basis. This speech presence probability is then combined with the output of an energy-based VAD to form a hybrid VAD, which serves as the basis for the remaining components of the speech enhancement system, including RTF estimation, adaptive beamforming, and post-filtering. Experiments are conducted in real automotive environments. The results show that the developed method can significantly improve the performance of both VAD and automatic speech recognition (ASR).
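
The abstract does not say how the DNN-based speech presence probability and the energy-based VAD are combined. As a minimal sketch only, the Python below assumes one plausible rule: a frame counts as speech when both detectors agree. The function names, the 6 dB margin, the 0.5 probability threshold, and the use of the minimum frame energy as a noise floor are all hypothetical choices made for illustration and are not taken from the paper. The speech presence probabilities are passed in as plain numbers; no DNN is modeled here.

```python
import math


def frame_energies_db(samples, frame_len):
    """Split samples into non-overlapping frames and return each frame's mean power in dB."""
    energies = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        power = sum(x * x for x in frame) / frame_len
        # Small constant avoids log10(0) on all-zero frames.
        energies.append(10.0 * math.log10(power + 1e-12))
    return energies


def energy_vad(energies_db, margin_db=6.0):
    """Flag frames whose energy exceeds the noise floor by more than margin_db.

    Assumption: the noise floor is taken to be the quietest frame in the input.
    """
    if not energies_db:
        return []
    floor = min(energies_db)
    return [e > floor + margin_db for e in energies_db]


def hybrid_vad(speech_prob, energy_flags, prob_threshold=0.5):
    """Combine per-frame speech presence probabilities with energy VAD flags.

    Assumption: a simple AND rule, so a frame is speech only if both agree.
    """
    if len(speech_prob) != len(energy_flags):
        raise ValueError("speech_prob and energy_flags must have the same length")
    return [p >= prob_threshold and flag
            for p, flag in zip(speech_prob, energy_flags)]


if __name__ == "__main__":
    # Four frames of four samples each: quiet, loud, loud, quiet.
    samples = [0.01] * 4 + [0.5] * 4 + [0.5] * 4 + [0.01] * 4
    flags = energy_vad(frame_energies_db(samples, frame_len=4))
    # Hypothetical per-frame probabilities standing in for a DNN output.
    probs = [0.1, 0.9, 0.3, 0.2]
    print(hybrid_vad(probs, flags))  # [False, True, False, False]
```

In this toy input the third frame is loud but has a low speech presence probability, so the AND rule rejects it. That is the kind of case (loud non-speech noise) where an energy-only detector would fire by mistake.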
