Fundamental Frequency and Direction-of-Arrival Estimation for Multichannel Speech Enhancement

Abstract

Audio systems usually receive the speech signals of interest in the presence of noise. The noise has a profound impact on the quality and intelligibility of the speech signals, so the noisy signals must be cleaned up before being played back, stored, or analyzed. We can estimate the speech signal of interest from the noisy signals using a priori knowledge about it. A human speech signal is broadband and consists of both voiced and unvoiced parts. The voiced part is quasi-periodic with a time-varying fundamental frequency (or pitch, as it is commonly called). We model periodic signals as a sum of harmonics. We can therefore pass the noisy signals through bandpass filters centered at the frequencies of the harmonics to enhance the signal. In addition, although the frequencies of the harmonics are the same across the channels of a microphone array, the multichannel periodic signals may have different phases because of the time differences of arrival (TDOAs), which are related to the direction of arrival (DOA) of the impinging sound waves. Hence, the outputs of the array can be steered toward the direction of the signal of interest to align their time differences, which may further reduce the effects of noise.

This thesis introduces a number of principles and methods for estimating periodic signals in noisy environments, with application to multichannel speech enhancement. We propose model-based signal enhancement built on the model of periodic signals, so the parameters of the model must be estimated in advance. The signal of interest is often contaminated by different types of noise, which may render many estimation methods suboptimal because they incorrectly assume white Gaussian noise. We therefore propose estimators that are robust to the noise, focusing on statistical and filtering-based methods that impose distortionless constraints with explicit relations between the parameters of the harmonics. The estimated fundamental frequencies are expected to be continuous over time, so we account for the time-varying fundamental frequency in the statistical methods to reduce the estimation error. We also propose a maximum likelihood DOA estimator that accounts for the noise statistics and the linear relationship between the TDOAs of the harmonics. The estimators have benefits over state-of-the-art statistical methods in colored noise. Evaluations against the minimum variance of the deterministic parameters and against other methods confirm that the proposed estimators are statistically efficient in colored noise and computationally simple.

Finally, we propose model-based beamformers for multichannel speech enhancement that exploit the estimated fundamental frequency and DOA of the signal of interest. This general framework is tailored to a number of beamformers that use the spectral and spatial information of the periodic signals, which are quasi-stationary over short intervals. Objective measures of speech quality and intelligibility confirm the advantage of the harmonic model-based beamformers over traditional, non-parametric beamformers and reveal the importance of accurate estimates of the model parameters.
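
To make the link between the harmonic model, the TDOAs, and array steering concrete, the following is a minimal sketch and not the estimators proposed in the thesis. It assumes a far-field source, a uniform linear array, unit-amplitude harmonics, and white Gaussian noise, which is exactly the assumption the abstract says often fails in practice. Under those assumptions, the fundamental frequency and DOA are found by a plain grid search that maximizes the summed power of a delay-and-sum output at the harmonic frequencies. All function names and parameter values (`simulate`, `harmonic_das_cost`, `estimate_f0_doa`, the 5 cm spacing, the 8 kHz sampling rate) are hypothetical choices for illustration.

```python
import numpy as np

C_SOUND = 343.0  # assumed speed of sound in m/s


def simulate(f0, theta, n_harm, fs, n_samples, n_mics, spacing, snr_db, rng):
    """Harmonic signal observed by a uniform linear array in white noise.

    Microphone m receives the source delayed by tau_m = m*d*sin(theta)/c.
    """
    t = np.arange(n_samples) / fs
    tau = np.arange(n_mics) * spacing * np.sin(theta) / C_SOUND
    x = np.zeros((n_mics, n_samples))
    for l in range(1, n_harm + 1):
        x += np.cos(2 * np.pi * l * f0 * (t[None, :] - tau[:, None]))
    noise_power = np.mean(x ** 2) / 10 ** (snr_db / 10)
    return x + rng.normal(scale=np.sqrt(noise_power), size=x.shape)


def harmonic_das_cost(x, f0, theta_grid, n_harm, fs, spacing):
    """Summed delay-and-sum output power at the harmonics of f0, per angle."""
    n_mics, n_samples = x.shape
    t = np.arange(n_samples) / fs
    omega = 2 * np.pi * f0 * np.arange(1, n_harm + 1)       # (L,)
    # Temporal step: correlate every channel with every harmonic.
    y = np.exp(-1j * np.outer(omega, t)) @ x.T              # (L, M)
    # Spatial step: undo the phase shift omega_l * tau_m(theta).
    tau = np.outer(np.sin(theta_grid), np.arange(n_mics)) * spacing / C_SOUND
    steer = np.exp(1j * omega[None, :, None] * tau[:, None, :])  # (K, L, M)
    return np.sum(np.abs(np.sum(steer * y[None], axis=2)) ** 2, axis=1)


def estimate_f0_doa(x, f0_grid, theta_grid, n_harm, fs, spacing):
    """Joint grid search for the (f0, theta) pair with maximum output power."""
    cost = np.stack([
        harmonic_das_cost(x, f0, theta_grid, n_harm, fs, spacing)
        for f0 in f0_grid
    ])
    i, j = np.unravel_index(np.argmax(cost), cost.shape)
    return f0_grid[i], theta_grid[j]


rng = np.random.default_rng(0)
fs, spacing, n_harm = 8000.0, 0.05, 5
x = simulate(f0=200.0, theta=np.deg2rad(30.0), n_harm=n_harm, fs=fs,
             n_samples=400, n_mics=8, spacing=spacing, snr_db=10.0, rng=rng)
f0_hat, theta_hat = estimate_f0_doa(
    x,
    f0_grid=np.arange(150.0, 301.0, 1.0),
    theta_grid=np.deg2rad(np.arange(-90.0, 91.0, 2.0)),
    n_harm=n_harm, fs=fs, spacing=spacing,
)
print(f"f0 estimate: {f0_hat:.1f} Hz, DOA estimate: {np.rad2deg(theta_hat):.1f} deg")
```

The sketch separates the temporal step (correlating each channel with the harmonics) from the spatial step (phase alignment across microphones), which mirrors the abstract's point that the harmonics share frequencies across channels but differ in phase through the TDOAs. In colored noise this unweighted search would generally be suboptimal, which is the gap the thesis's estimators are described as addressing.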

Bibliographic record

  • Author

    Karimian-Azari Sam;

  • Author affiliation
  • Year 2016
  • Total pages
  • Original format PDF
  • Text language eng
  • Chinese Library Classification
