Journal: Multimedia Tools and Applications

Improved vowel region detection from a continuous speech using post processing of vowel onset points and vowel end-points



Abstract

Vowels are produced with an open configuration of the vocal tract, without any audible friction. The acoustic signal is relatively loud, with varying strength of impulse-like excitation. Vowels carry significant energy in the low-frequency bands of the speech signal. Acoustic events such as the vowel onset point (VOP) and vowel end-point (VEP) can be used as landmarks to detect vowel regions in a speech signal. In this paper, a two-stage algorithm is proposed to detect precise vowel regions. In the first stage, the speech signal is processed using zero frequency filtering to emphasize the energy content in the low-frequency bands of speech. The zero frequency filtered signal predominantly contains the low-frequency content of the speech signal, as it is filtered around 0 Hz. This step is followed by the extraction of dominant spectral peaks from the magnitude spectrum around glottal closure regions of the speech signal. The VOPs and VEPs are obtained by convolving the enhanced spectral contour of the zero frequency filtered signal with a first-order Gaussian differentiator. In the second stage, post-processing is carried out in the regions around each VOP and VEP to remove spurious vowel regions based on the uniformity of epoch intervals. In addition, the positions of the VOPs and VEPs are corrected using the strength of excitation of the speech signal. The performance of the proposed vowel region detection method is compared with existing state-of-the-art methods on the TIMIT acoustic-phonetic speech corpus. The proposed method is reported to yield a significant improvement in vowel region detection in both clean and noisy environments.
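The step in which VOPs and VEPs are obtained by convolving a contour with a first-order Gaussian differentiator can be illustrated with a minimal sketch. The abstract gives no parameter values, so everything specific here is an assumption made for illustration: the function names (`gaussian_differentiator`, `local_maxima`, `detect_vop_vep`), the kernel width `sigma`, the relative threshold `rel_threshold`, and the synthetic rectangular contour that stands in for the enhanced spectral contour. The zero frequency filtering, spectral peak extraction, and post-processing stages are not reproduced.

```python
import numpy as np


def gaussian_differentiator(sigma, half_width):
    """Sampled first derivative of a Gaussian with standard deviation sigma."""
    t = np.arange(-half_width, half_width + 1, dtype=float)
    return -t / sigma**2 * np.exp(-(t**2) / (2.0 * sigma**2))


def local_maxima(x, threshold):
    """Indices of local maxima of x whose value exceeds threshold."""
    return [
        n
        for n in range(1, len(x) - 1)
        if x[n] > x[n - 1] and x[n] >= x[n + 1] and x[n] > threshold
    ]


def detect_vop_vep(contour, sigma=10.0, rel_threshold=0.5):
    """Return (onset candidates, end-point candidates) as sample indices.

    sigma and rel_threshold are illustrative assumptions, not values
    taken from the paper.
    """
    kernel = gaussian_differentiator(sigma, half_width=int(3 * sigma))
    # Convolving with the Gaussian derivative gives the slope of the
    # smoothed contour: positive on rising edges, negative on falling ones.
    evidence = np.convolve(contour, kernel, mode="same")
    threshold = rel_threshold * np.max(np.abs(evidence))
    vops = local_maxima(evidence, threshold)   # strong rises: onset candidates
    veps = local_maxima(-evidence, threshold)  # strong falls: end-point candidates
    return vops, veps


# Synthetic stand-in for the enhanced spectral contour:
# one "vowel" occupying samples 100..299 of a 400-sample contour.
contour = np.zeros(400)
contour[100:300] = 1.0

vops, veps = detect_vop_vep(contour)
print("VOP candidates:", vops)
print("VEP candidates:", veps)
```

On this synthetic contour the sketch yields one onset candidate next to the rising edge at sample 100 and one end-point candidate next to the falling edge at sample 300. On real speech the evidence contour would contain many peaks of varying height, which is where a post-processing stage such as the one described in the abstract becomes relevant.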
