Journal: IEEE Transactions on Audio, Speech, and Language Processing

Detection of Glottal Closure Instants From Speech Signals: A Quantitative Review

Abstract

The pseudo-periodicity of voiced speech can be exploited in several speech processing applications. This requires, however, that the precise locations of the glottal closure instants (GCIs) are available. The focus of this paper is the evaluation of automatic methods for detecting GCIs directly from the speech waveform. Five state-of-the-art GCI detection algorithms are compared using six different databases that contain many hours of speech by multiple speakers, with contemporaneous electroglottographic recordings as ground truth. The five techniques compared are Hilbert Envelope-based detection (HE), the Zero Frequency Resonator-based method (ZFR), the Dynamic Programming Phase Slope Algorithm (DYPSA), Speech Event Detection using the Residual Excitation And a Mean-based Signal (SEDREAMS), and Yet Another GCI Algorithm (YAGA). The efficacy of these methods is first evaluated on clean speech, in terms of both reliability and accuracy. Their robustness to additive noise and to reverberation is also assessed. A further contribution of the paper is the evaluation of their performance on a concrete speech processing application: the causal-anticausal decomposition of speech. It is shown that for clean speech, SEDREAMS and YAGA are the best performing techniques, in terms of both identification rate and accuracy. ZFR and SEDREAMS also show superior robustness to additive noise and reverberation.
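
The abstract names identification rate and accuracy as evaluation measures but does not define them. The sketch below shows one way such scores could be computed against electroglottographic reference GCIs. It assumes a larynx-cycle convention that is common in GCI evaluation, not one confirmed by this page: each reference GCI owns the interval between the midpoints to its neighbouring reference GCIs; a cycle with exactly one detection counts as identified, with none as a miss, and with more than one as a false alarm; accuracy is taken as the standard deviation of the timing error over identified cycles. The function name `gci_metrics` and the dictionary keys are hypothetical.

```python
from bisect import bisect_left
from statistics import pstdev


def gci_metrics(reference, detected):
    """Score detected GCIs against reference GCIs (times in seconds).

    Assumed convention (not taken from the abstract): each interior
    reference GCI defines a cycle spanning the midpoints to its
    neighbouring reference GCIs.
    """
    ref = sorted(reference)
    det = sorted(detected)
    if len(ref) < 3:
        raise ValueError("need at least three reference GCIs")

    hits = misses = false_alarms = 0
    errors = []

    # The first and last reference GCIs lack a neighbour on one side,
    # so only interior reference GCIs are scored.
    for k in range(1, len(ref) - 1):
        lo = (ref[k - 1] + ref[k]) / 2.0
        hi = (ref[k] + ref[k + 1]) / 2.0
        # Detections falling in the half-open interval [lo, hi).
        i = bisect_left(det, lo)
        j = bisect_left(det, hi)
        count = j - i
        if count == 1:
            hits += 1
            errors.append(det[i] - ref[k])  # signed timing error
        elif count == 0:
            misses += 1
        else:
            false_alarms += 1

    cycles = len(ref) - 2
    return {
        "identification_rate": hits / cycles,
        "miss_rate": misses / cycles,
        "false_alarm_rate": false_alarms / cycles,
        # Spread of the timing error over identified cycles only.
        "accuracy_std": pstdev(errors) if errors else float("nan"),
    }


if __name__ == "__main__":
    # Toy data: reference GCIs every 10 ms (a 100 Hz voice).
    reference = [0.000, 0.010, 0.020, 0.030, 0.040, 0.050]
    detected = [0.0102, 0.0198, 0.028, 0.032]
    print(gci_metrics(reference, detected))
```

In the toy data, two of the four scored cycles contain exactly one detection, one contains two detections, and one contains none, so the three rates sum to one. Under this convention, "reliability" corresponds to the three rates and "accuracy" to the timing spread.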