A Speech Enhancement System for Automotive Speech Recognition with a Hybrid Voice Activity Detection Method



Abstract

This paper presents a front-end speech enhancement approach to robust speech recognition in automotive environments. It combines hybrid voice activity detection (VAD), relative transfer function (RTF) based generalized sidelobe cancelation, and single-channel post-filtering to enhance the speech signal of interest, thereby improving the robustness of speech recognition. First, we choose four typical driving scenarios, which cover most of the noise types found in automobiles, to record training data. The recorded data is then used to train deep neural network (DNN) models for both speech and noise. The trained DNNs are subsequently used to estimate the speech presence probability on a frame-by-frame basis. This speech presence probability is then combined with the output of an energy-based VAD to form a hybrid VAD, which serves as the basis for the remaining components of the speech enhancement system, including RTF estimation, adaptive beamforming, and post-filtering. Experiments are conducted in real automotive environments. The results show that the developed method can significantly improve the performance of both VAD and automatic speech recognition (ASR).
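The abstract does not say how the DNN-based speech presence probability and the energy-based VAD are fused, so the following is only a minimal sketch under stated assumptions. It assumes non-overlapping frames, an energy detector that compares each frame against the quietest frame plus a margin, and a fusion rule that declares speech only when both detectors agree. The function names, the `margin_db` and `spp_threshold` parameters, and the hard-coded `spp` values (standing in for the output of a trained DNN) are all hypothetical and are not taken from the paper.

```python
import math
from typing import List, Sequence


def frame_energies_db(signal: Sequence[float], frame_len: int) -> List[float]:
    """Split the signal into non-overlapping frames and return each frame's energy in dB."""
    energies = []
    for start in range(0, len(signal) - frame_len + 1, frame_len):
        frame = signal[start:start + frame_len]
        power = sum(x * x for x in frame) / frame_len
        energies.append(10.0 * math.log10(power + 1e-12))  # small floor avoids log(0)
    return energies


def energy_vad(energies_db: Sequence[float], margin_db: float = 6.0) -> List[bool]:
    """Flag frames whose energy exceeds the quietest frame by margin_db (assumed rule)."""
    noise_floor = min(energies_db)
    return [e > noise_floor + margin_db for e in energies_db]


def hybrid_vad(spp: Sequence[float], energy_flags: Sequence[bool],
               spp_threshold: float = 0.5) -> List[bool]:
    """Declare speech only when both detectors agree (assumed fusion rule)."""
    return [p > spp_threshold and e for p, e in zip(spp, energy_flags)]


# Toy input: 4 frames of 4 samples; frames 1 and 2 are loud, frames 0 and 3 are quiet.
signal = [0.01, -0.01, 0.01, -0.01,
          0.5, -0.5, 0.5, -0.5,
          0.4, -0.4, 0.4, -0.4,
          0.01, -0.01, 0.01, -0.01]
# Stand-in for the per-frame speech presence probability a trained DNN would output.
spp = [0.1, 0.9, 0.3, 0.2]

energies = frame_energies_db(signal, frame_len=4)
flags = energy_vad(energies)
decisions = hybrid_vad(spp, flags)
print(decisions)
```

In this toy case the energy detector flags both loud frames, while the assumed probability values support only the first of them, so the fused decision marks a single frame as speech. A frame-level decision of this kind is the sort of signal that could gate RTF estimation and the adaptation of the beamformer and post-filter, though the paper's actual gating logic is not described in the abstract.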


Bibliographic details

Source
International Workshop on Acoustic Signal Enhancement | 2018 | pp. 1-9 | 9 pages in total
Conference location: Tokyo (JP)
Authors
Haikun Wang; Zhongfu Ye; Jingdong Chen;
Author affiliations

Department of Electronic Engineering and Information Science, University of Science and Technology of China, Hefei 230027, China;

Department of Electronic Engineering and Information Science, University of Science and Technology of China, Hefei 230027, C;

Original format: PDF
Keywords
Speech enhancement; Automotive engineering; Microphones; Estimation; Signal to noise ratio; Speech recognition; Acoustics;


Similar literature


1. Speech enhancement through voice activity detection using speech absence probability based on Teager energy [J] . PARK Yun-sik, LEE Sang-min Journal of Central South University (English Edition) . 2013, Issue 2

2. Hidden-Markov-model-based voice activity detector with high speech detection rate for speech enhancement [J] . Veisi H., Sameti H. Signal Processing, IET . 2012, Issue 1

3. Voice Activity Detection Using Global Speech Absence Probability Based on Teager Energy for Speech Enhancement [J] . Yun-Sik PARK, Sangmin LEE IEICE transactions on information and systems . 2012, Issue 10

4. A Speech Enhancement System for Automotive Speech Recognition with a Hybrid Voice Activity Detection Method [C] . Haikun Wang, Zhongfu Ye, Jingdong Chen International Workshop on Acoustic Signal Enhancement . 2018

5. Advances in Audiovisual Speech Processing for Robust Voice Activity Detection and Automatic Speech Recognition [D] . Tao, Fei. 2018

6. A preliminary study on improving the recognition of esophageal speech using a hybrid system based on statistical voice conversion [O] . Othman Lachhab, Joseph Di Martino, Elhassane Ibn Elhaj, -1

7. Can audio-visual speech recognition outperform acoustically enhanced speech recognition in automotive environment? [O] . Navarathna Rajitha, Kleinschmidt Tristan, Dean David B., 2011

