
New time-frequency domain pitch estimation methods for speech signals under low levels of SNR



Abstract

The major objective of this research is to develop novel pitch estimation methods capable of handling speech signals in practical situations where only noise-corrupted speech observations are available. With this objective in mind, the estimation task is carried out using two different approaches. In the first approach, the noisy speech observations are directly employed to develop two new time-frequency domain pitch estimation methods. These methods are based on extracting a pitch-harmonic and finding the corresponding harmonic number required for pitch estimation. Considering that voiced speech is the output of a vocal tract system driven by a sequence of pulses separated by the pitch period, in the second approach, instead of using the noisy speech directly for pitch estimation, an excitation-like signal (ELS) is first generated from the noisy speech or its noise-reduced version.
In the first approach, a harmonic cosine autocorrelation (HCAC) model of clean speech in terms of its pitch-harmonics is first introduced. In order to extract a pitch-harmonic, we propose an optimization technique based on least-squares fitting of the autocorrelation function (ACF) of the noisy speech to the HCAC model. By exploiting the extracted pitch-harmonic along with the fast Fourier transform (FFT) based power spectrum of noisy speech, we then deduce a harmonic measure and a harmonic-to-noise-power ratio (HNPR) to determine the desired harmonic number of the extracted pitch-harmonic. In the proposed optimization, an initial estimate of the pitch-harmonic is obtained from the maximum peak of the smoothed FFT power spectrum. In addition to the HCAC model, where the cross-product terms of different harmonics are neglected, we derive a compact yet accurate harmonic sinusoidal autocorrelation (HSAC) model for a clean speech signal. The new HSAC model is then used in the least-squares model-fitting optimization technique to extract a pitch-harmonic.
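
The idea of fitting the ACF of noisy speech to a harmonic cosine model can be illustrated with a minimal sketch. This is not the HCAC or HSAC formulation of the thesis, which the abstract does not spell out: it assumes a single cosine term A*cos(2*pi*f*m/fs), a plain grid search around a rough initial frequency, and a harmonic number that is simply given rather than determined from a harmonic measure or the HNPR. The function names, the synthetic frame, and all parameter values are hypothetical.

```python
import math
import random

def autocorrelation(x, max_lag):
    """Biased sample autocorrelation r[m] for lags m = 0..max_lag."""
    n = len(x)
    return [sum(x[i] * x[i + m] for i in range(n - m)) / n
            for m in range(max_lag + 1)]

def fit_harmonic_to_acf(acf, fs, f_init, search_hz=20.0, step_hz=0.25):
    """Least-squares fit of A*cos(2*pi*f*m/fs) to acf[m] for m >= 1.

    For each candidate f the best amplitude A has a closed form, so only
    f needs a search; a plain grid around f_init is used here.
    """
    lags = range(1, len(acf))  # lag 0 is skipped: white noise inflates r[0]
    best_f, best_err = f_init, float("inf")
    steps = int(round(2 * search_hz / step_hz))
    for k in range(steps + 1):
        f = f_init - search_hz + k * step_hz
        if f <= 0:
            continue
        basis = [math.cos(2 * math.pi * f * m / fs) for m in lags]
        amp = (sum(acf[m] * b for m, b in zip(lags, basis))
               / sum(b * b for b in basis))
        err = sum((acf[m] - amp * b) ** 2 for m, b in zip(lags, basis))
        if err < best_err:
            best_f, best_err = f, err
    return best_f

# Synthetic "voiced" frame (assumption): three harmonics of a 120 Hz pitch
# plus white Gaussian noise of unit variance.
fs, f0, n_samples = 8000, 120.0, 800
rng = random.Random(0)
amps = [0.6, 1.0, 0.4]
frame = [sum(a * math.cos(2 * math.pi * (k + 1) * f0 * n / fs)
             for k, a in enumerate(amps)) + rng.gauss(0.0, 1.0)
         for n in range(n_samples)]

acf = autocorrelation(frame, max_lag=400)
# Rough initial guess near the second harmonic; the thesis takes it from the
# peak of a smoothed FFT power spectrum instead.
harmonic_hz = fit_harmonic_to_acf(acf, fs, f_init=235.0)
harmonic_number = 2  # assumed known in this sketch
pitch_hz = harmonic_hz / harmonic_number
```

Working in the autocorrelation domain is convenient because white noise mostly affects lag 0, which the fit ignores, while the periodic structure of the harmonics survives at the other lags.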
In the second approach, we first develop a pitch estimation method by using an excitation-like signal (ELS) generated from the noisy speech. To this end, a technique based on the principle of homomorphic deconvolution is proposed for extracting the vocal-tract system (VTS) parameters from the noisy speech, which are utilized to perform an inverse-filtering of the noisy speech to produce a residual signal (RS). In order to reduce the effect of noise on the RS, a noise-compensation scheme is introduced in the autocorrelation domain. The noise-compensated ACF of the RS is then employed to generate a squared Hilbert envelope (SHE) as the ELS of the voiced speech. To further overcome the adverse effect of noise on the ELS, a new symmetric normalized magnitude difference function of the ELS is proposed for eventual pitch estimation.
The cepstrum has been widely used in speech signal processing but has limited capability of handling noise. One potential solution could be the introduction of a noise reduction block prior to pitch estimation based on the conventional cepstrum, a framework already available in many practical applications, such as mobile communication and hearing aids. Motivated by the advantages of the existing framework and considering the superiority of our ELS to the speech itself in providing clues for pitch information, we develop a cepstrum-based pitch estimation method by using the ELS obtained from the noise-reduced speech. For this purpose, we propose a noise subtraction scheme in the frequency domain, which takes into account the possible cross-correlation between speech and noise and has the advantage that the noise estimate is updated over time and adjusted at each frame. The enhanced speech thus obtained is utilized to extract the vocal-tract system (VTS) parameters via the homomorphic deconvolution technique. A residual signal (RS) is then produced by inverse-filtering the enhanced speech with the extracted VTS parameters.
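
A squared Hilbert envelope and a magnitude difference function can be sketched as follows. The sketch makes several assumptions that are not in the abstract: the residual signal is an idealized impulse train, no noise compensation is applied, and a plain average magnitude difference function (AMDF) stands in for the symmetric normalized variant proposed in the thesis, whose definition is not given here. All function names are hypothetical, and a direct O(N^2) DFT is used only to keep the example dependency-free.

```python
import cmath
import math

def dft(x, inverse=False):
    """Direct O(N^2) discrete Fourier transform (or its inverse)."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        acc = sum(x[t] * cmath.exp(sign * 2j * math.pi * k * t / n)
                  for t in range(n))
        out.append(acc / n if inverse else acc)
    return out

def squared_hilbert_envelope(x):
    """|analytic signal|^2, with the analytic signal built in the DFT domain."""
    n = len(x)
    spec = dft(x)
    # Keep DC (and Nyquist for even n), double positive frequencies,
    # zero out negative frequencies.
    weights = [0.0] * n
    weights[0] = 1.0
    if n % 2 == 0:
        weights[n // 2] = 1.0
        for k in range(1, n // 2):
            weights[k] = 2.0
    else:
        for k in range(1, (n + 1) // 2):
            weights[k] = 2.0
    analytic = dft([s * w for s, w in zip(spec, weights)], inverse=True)
    return [abs(z) ** 2 for z in analytic]

def amdf_period(signal, min_lag, max_lag):
    """Lag in [min_lag, max_lag] that minimizes the average magnitude difference."""
    n = len(signal)
    best_lag, best_val = min_lag, float("inf")
    for lag in range(min_lag, max_lag + 1):
        val = sum(abs(signal[i] - signal[i + lag])
                  for i in range(n - lag)) / (n - lag)
        if val < best_val:
            best_lag, best_val = lag, val
    return best_lag

# Idealized residual (assumption): one pulse every 50 samples.
residual = [1.0 if n % 50 == 10 else 0.0 for n in range(256)]
els = squared_hilbert_envelope(residual)
period_samples = amdf_period(els, min_lag=20, max_lag=80)
```

The envelope is non-negative and peaks at the excitation instants regardless of the polarity of the residual, which is one reason an envelope-type signal is a convenient input for a period search.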
It is found that, unlike the previous ELS-based method, the squared Hilbert envelope (SHE) computed from the RS of the enhanced speech without noise compensation is sufficient to represent an ELS. Finally, in order to tackle the undesirable effect of noise on the ELS at a very low SNR and overcome the limitation of the conventional cepstrum in handling different types of noise, a time-frequency domain pseudo cepstrum of the ELS of the enhanced speech, incorporating information of both magnitude and phase spectra of the ELS, is proposed for pitch estimation. (Abstract shortened by UMI.)
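
For reference, the conventional cepstrum that serves as the baseline here can be sketched in a few lines. This is the textbook real cepstrum (inverse transform of the log-magnitude spectrum), not the pseudo cepstrum proposed in the thesis, which also uses phase information and is not defined in the abstract. The synthetic frame, the two-pole resonator standing in for the vocal tract, and all names and parameter values are assumptions.

```python
import cmath
import math

def real_cepstrum(frame):
    """Real cepstrum of a Hamming-windowed frame via a direct O(N^2) DFT."""
    n = len(frame)
    windowed = [v * (0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)))
                for i, v in enumerate(frame)]
    log_mag = []
    for k in range(n):
        acc = sum(windowed[t] * cmath.exp(-2j * math.pi * k * t / n)
                  for t in range(n))
        log_mag.append(math.log(abs(acc) + 1e-12))  # small floor avoids log(0)
    # The log-magnitude spectrum is real and even, so a cosine sum suffices.
    return [sum(log_mag[k] * math.cos(2 * math.pi * k * q / n)
                for k in range(n)) / n
            for q in range(n)]

# Synthetic voiced frame (assumption): a pulse train with a 50-sample period
# driving a two-pole resonator near 700 Hz.
fs, true_period, n_samples = 8000, 50, 400
r, theta = 0.9, 2 * math.pi * 700 / fs
frame = []
for n in range(n_samples):
    excitation = 1.0 if n % true_period == 0 else 0.0
    y1 = frame[-1] if n >= 1 else 0.0
    y2 = frame[-2] if n >= 2 else 0.0
    frame.append(excitation + 2 * r * math.cos(theta) * y1 - r * r * y2)

cep = real_cepstrum(frame)
# Search quefrencies of 20..80 samples, i.e. pitch between 100 and 400 Hz.
est_period = max(range(20, 81), key=lambda q: cep[q])
est_pitch_hz = fs / est_period
```

Because the cepstrum relies on the logarithm of the magnitude spectrum, noise that fills the valleys between harmonics flattens the periodic ripple and weakens the pitch peak, which is the limitation the abstract refers to.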

Bibliographic record

  • Author

    Shahnaz Celia

  • Author affiliation
  • Year 2009
  • Total pages
  • Original format PDF
  • Text language en
  • CLC classification
