Conference of the International Speech Communication Association

Concurrent processing of voice activity detection and noise reduction using empirical mode decomposition and modulation spectrum analysis

Abstract

Voice activity detection (VAD) is mainly used to detect speech/non-speech periods in observed noisy signals. The detected periods are then used to reduce noise components or enhance speech components in noisy speech. However, current VAD techniques have a serious problem: their accuracy in detecting speech/non-speech periods drops drastically when they are applied to noisy speech and/or to mixtures containing non-speech sounds such as music and environmental sounds. VAD therefore needs to be robust enough to detect speech periods accurately in these situations. This paper proposes concurrent processing of VAD and noise reduction (NR) using empirical mode decomposition (EMD) and modulation spectrum analysis (MSA) to resolve these problems simultaneously. The proposed method first reduces stationary background noise with EMD, without estimating the SNR (noise conditions), and then reduces non-stationary noise, including non-speech components, with MSA while determining speech/non-speech periods by thresholding the noise-reduced speech. Three experiments on VAD/NR in real environments were conducted to evaluate the proposed method against typical methods (Otsu's method, G.729B, and AMR) and our previous methods. The results demonstrated that the proposed method could accurately detect speech/non-speech periods and effectively reduce noise components at the same time.
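To make the thresholding step concrete, the following is a minimal sketch of an Otsu-style threshold applied to frame log-energies, which is the general idea behind the Otsu baseline named in the abstract. It is not the proposed EMD/MSA method, and it does not reproduce the authors' implementation. The function names (frame_log_energy, otsu_threshold, detect_speech), the 160-sample frame length, and the 64-bin histogram are all assumptions chosen for illustration.

```python
import math


def frame_log_energy(signal, frame_len):
    """Log energy in dB of consecutive non-overlapping frames."""
    energies = []
    for start in range(0, len(signal) - frame_len + 1, frame_len):
        frame = signal[start:start + frame_len]
        power = sum(x * x for x in frame) / frame_len
        energies.append(10.0 * math.log10(power + 1e-12))  # avoid log(0)
    return energies


def otsu_threshold(values, n_bins=64):
    """Otsu's method: threshold that maximizes between-class variance."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return lo  # degenerate case: all values identical
    width = (hi - lo) / n_bins
    hist = [0] * n_bins
    for v in values:
        hist[min(int((v - lo) / width), n_bins - 1)] += 1
    centers = [lo + (i + 0.5) * width for i in range(n_bins)]
    total = len(values)
    sum_all = sum(c * h for c, h in zip(centers, hist))
    best_var, best_k = -1.0, 0
    w0, sum0 = 0, 0.0
    for k in range(n_bins - 1):
        w0 += hist[k]                 # frames at or below bin k
        sum0 += centers[k] * hist[k]
        w1 = total - w0               # frames above bin k
        if w0 == 0 or w1 == 0:
            continue
        mu0 = sum0 / w0
        mu1 = (sum_all - sum0) / w1
        var_between = w0 * w1 * (mu0 - mu1) ** 2
        if var_between > best_var:
            best_var, best_k = var_between, k
    return lo + (best_k + 1) * width  # upper edge of the chosen bin


def detect_speech(signal, frame_len=160):
    """Return one boolean per frame: True where energy exceeds the threshold."""
    energies = frame_log_energy(signal, frame_len)
    threshold = otsu_threshold(energies)
    return [e > threshold for e in energies]
```

A plain energy threshold of this kind separates loud from quiet frames only, so it will also flag music or environmental sounds as speech. That limitation is the motivation the abstract gives for reducing noise with EMD and MSA before thresholding.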