Bispectra Analysis-Based VAD for Robust Speech Recognition


Abstract

A robust and effective voice activity detection (VAD) algorithm is proposed for improving speech recognition performance in noisy environments. The approach first filters the input channel to avoid high-energy noise components and then estimates the speech/non-speech bispectra by means of third-order auto-cumulants. The algorithm differs from many others in how the decision rule is formulated (as detection tests) and in the domain in which it operates. Clear improvements in speech/non-speech discrimination accuracy demonstrate the effectiveness of the proposed VAD. Applying a statistical detection test is shown to separate the speech and noise distributions better, which allows more effective discrimination and a tradeoff between complexity and performance. The algorithm also incorporates a preceding noise reduction block that improves the accuracy of speech/non-speech detection. The experimental analysis, carried out on the AURORA databases and tasks, provides an extensive performance evaluation together with an exhaustive comparison to standard VADs such as ITU G.729, GSM AMR, and ETSI AFE for distributed speech recognition (DSR), as well as other recently reported VADs.
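To make the bispectrum step concrete, the following is a minimal sketch of how a frame's bispectrum can be estimated from third-order auto-cumulants. It is not the authors' implementation: the function names, the lag range, the frequency grid, and the fixed energy threshold are illustrative assumptions. The threshold merely stands in for the statistical detection test mentioned in the abstract, and the filtering and noise reduction stages are omitted.

```python
import cmath
import math


def third_order_cumulant(frame, max_lag):
    """Biased estimate of C3(i, j) = E[x(t) x(t+i) x(t+j)] for |i|, |j| <= max_lag."""
    n = len(frame)
    mean = sum(frame) / n
    x = [v - mean for v in frame]  # third-order cumulant equals third moment only for zero-mean data
    lags = range(-max_lag, max_lag + 1)
    c3 = {}
    for i in lags:
        for j in lags:
            lo = max(0, -i, -j)          # keep t, t+i and t+j inside the frame
            hi = min(n, n - i, n - j)
            c3[(i, j)] = sum(x[t] * x[t + i] * x[t + j] for t in range(lo, hi)) / n
    return c3


def bispectrum(c3, max_lag, n_freq):
    """2-D DFT of the cumulant sequence on an n_freq x n_freq frequency grid."""
    lags = range(-max_lag, max_lag + 1)
    grid = [[0j] * n_freq for _ in range(n_freq)]
    for a in range(n_freq):
        for b in range(n_freq):
            w1 = 2 * math.pi * a / n_freq
            w2 = 2 * math.pi * b / n_freq
            grid[a][b] = sum(
                c3[(i, j)] * cmath.exp(-1j * (w1 * i + w2 * j))
                for i in lags
                for j in lags
            )
    return grid


def bispectral_energy(frame, max_lag=3, n_freq=8):
    """Mean squared bispectrum magnitude over the grid (an assumed summary feature)."""
    c3 = third_order_cumulant(frame, max_lag)
    grid = bispectrum(c3, max_lag, n_freq)
    return sum(abs(v) ** 2 for row in grid for v in row) / (n_freq * n_freq)


def is_speech_like(frame, threshold):
    """Hypothetical decision rule: a plain threshold, not the paper's detection test."""
    return bispectral_energy(frame, 3, 8) > threshold
```

Calling `is_speech_like(frame, threshold)` on successive frames gives a per-frame decision. The bispectrum of a Gaussian process is theoretically zero, so Gaussian-like noise tends toward small values while non-Gaussian or skewed content does not. A threshold would in practice have to be tuned on noise-only data.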

Bibliographic details

Source: 2005, pp. 567-576 (10 pages)
Conference location: Las Palmas (ES)
Authors: J.M. Gorriz; C.G. Puntonet; J. Ramirez; J.C. Segura
Author affiliation: E.T.S.I.I., Universidad de Granada, C/Periodista Daniel Saucedo, 18071 Granada, Spain
Original format: PDF
Language: English
Chinese Library Classification: Artificial intelligence theory

