Journal: IEEE/ACM Transactions on Audio, Speech, and Language Processing

Multi-Microphone Speech Dereverberation and Noise Reduction Using Relative Early Transfer Functions

Abstract

In speech communication systems, the microphone signals are degraded by reverberation and ambient noise. The reverberant speech can be separated into two components: an early speech component that includes the direct path and some early reflections, and a late reverberant component that includes all the late reflections. In this paper, a novel algorithm that simultaneously suppresses early reflections, late reverberation, and ambient noise is presented. A multi-microphone minimum mean square error estimator is used to obtain a spatially filtered version of the early speech component. The estimator is constructed as a minimum variance distortionless response (MVDR) beamformer (BF) followed by a postfilter (PF). Three unique design features characterize the proposed method. First, the MVDR BF is implemented in a special structure, named the nonorthogonal generalized sidelobe canceller (NO-GSC). Compared with the more conventional orthogonal GSC structure, the new structure allows for a simpler implementation of the GSC blocks for various MVDR constraints. Second, in contrast to earlier works, relative early transfer functions (RETFs) are used in the MVDR criterion rather than either the entire relative transfer functions (RTFs) or only the direct path of the desired speech signal. An estimator of the RETFs is proposed as well. Third, the late reverberation and noise are processed by both the beamforming stage and the PF stage. Since the relative power of the noise and the late reverberation varies with the frame index, a computationally efficient method for the required matrix inversion is proposed to avoid this cumbersome operation. The algorithm was evaluated and compared with two alternative multichannel algorithms and one single-channel algorithm, using simulated data and data recorded in a room with a reverberation time of 0.5 s, for various source-to-microphone-array distances (1-4 m) and several signal-to-noise levels. The processed signals were tested using two commonly used objective measures, namely perceptual evaluation of speech quality and log-spectral distance. As an additional objective measure, the improvement in word accuracy percentage of an acoustic speech recognition system is also demonstrated.
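
The MVDR stage mentioned in the abstract can be illustrated with the textbook closed form w = Φ⁻¹g / (gᴴΦ⁻¹g), where g is the steering vector (here playing the role of the RETF vector) and Φ is the covariance of the undesired components. The sketch below is only that generic formula for a single time-frequency bin. It is not the paper's NO-GSC structure, its RETF estimator, its postfilter, or its efficient inversion method. The function names, the three-microphone values, and the covariance model (late-reverberation power times a spatial coherence matrix, plus uncorrelated noise) are all assumptions made for illustration.

```python
def solve(A, b):
    """Solve A x = b for a small complex system (Gaussian elimination, partial pivoting)."""
    n = len(b)
    # Augmented matrix [A | b] with complex entries.
    M = [[complex(v) for v in A[i]] + [complex(b[i])] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    # Back substitution.
    x = [0j] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def mvdr_weights(phi, g):
    """Textbook MVDR weights: w = phi^{-1} g / (g^H phi^{-1} g)."""
    u = solve(phi, g)                                          # phi^{-1} g
    denom = sum(gi.conjugate() * ui for gi, ui in zip(g, u))   # g^H phi^{-1} g
    return [ui / denom for ui in u]


def apply_beamformer(w, y):
    """Beamformer output w^H y for one time-frequency bin."""
    return sum(wi.conjugate() * yi for wi, yi in zip(w, y))


# Hypothetical RETF vector for 3 microphones, normalized to the first microphone.
g = [1 + 0j, 0.5 - 0.3j, -0.2 + 0.4j]

# Assumed interference model: phi = phi_r * Gamma + phi_v * I, where Gamma is an
# illustrative spatial coherence matrix of the late reverberation.
gamma = [[1.0, 0.6, 0.2],
         [0.6, 1.0, 0.6],
         [0.2, 0.6, 1.0]]
phi_r, phi_v = 0.5, 0.1  # illustrative late-reverberation and noise powers
phi = [[phi_r * gamma[i][j] + (phi_v if i == j else 0.0) for j in range(3)]
       for i in range(3)]

w = mvdr_weights(phi, g)

# Distortionless constraint: the response toward g should be close to 1.
response = apply_beamformer(w, g)

# A clean early-speech coefficient s observed through g passes undistorted.
s = 2 + 1j
output = apply_beamformer(w, [gi * s for gi in g])
```

Because `phi_r` and `phi_v` change from frame to frame, a direct implementation like this one re-solves the linear system for every frame and frequency bin, which is the cost the abstract says the proposed inversion method is designed to avoid.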