Journal: IEEE/ACM Transactions on Audio, Speech, and Language Processing

Foreground Speech Segmentation and Enhancement Using Glottal Closure Instants and Mel Cepstral Coefficients


Abstract

In this paper, the speech signal recorded from the desired speaker close to the microphone in a natural environment is regarded as foreground speech, and the rest of the interfering sources as background. The proposed method exploits speech production features, namely glottal closure instants in the time domain and vocal tract information in the spectral domain, to segment the desired speaker's speech and to further enhance it. The foreground speech is then perceptually enhanced in the mel-frequency domain using mel-cepstral coefficients, with inversion through a mel log spectrum approximation filter. The focus is on enhancing the production and perceptual features of foreground speech rather than on modeling the interfering sources. Speech data were collected in different natural environments from different speakers to evaluate the proposed method. The enhanced speech signals derived at three different stages of the proposed method are compared with state-of-the-art methods in terms of subjective and objective measures. The proposed method outperforms the considered state-of-the-art methods. In terms of the proposed objective measure, the enhancement approach presented in this paper gives an average improvement of 12 dB, whereas an existing spectral subtraction-based method provides 3 dB. Moreover, a subjective evaluation with 24 subjects corroborates the objective test results.
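To put the 12 dB and 3 dB figures in perspective, the following minimal sketch converts decibel improvements into linear power ratios. It assumes the objective measure is a power-ratio quantity expressed as 10·log10(ratio); the abstract does not name the measure, so this is an assumption and not the paper's definition. The function names are hypothetical and chosen only for illustration.

```python
import math


def db_to_power_ratio(db: float) -> float:
    """Convert a decibel value to a linear power ratio (assumes 10*log10 convention)."""
    return 10.0 ** (db / 10.0)


def power_ratio_to_db(ratio: float) -> float:
    """Convert a positive linear power ratio back to decibels."""
    if ratio <= 0:
        raise ValueError("power ratio must be positive")
    return 10.0 * math.log10(ratio)


# Average improvements quoted in the abstract.
proposed_gain_db = 12.0   # proposed enhancement approach
baseline_gain_db = 3.0    # spectral subtraction-based method

# Linear power ratios implied by each improvement (under the stated assumption).
proposed_ratio = db_to_power_ratio(proposed_gain_db)
baseline_ratio = db_to_power_ratio(baseline_gain_db)

# Difference between the two methods, in dB and as a linear factor.
gap_db = proposed_gain_db - baseline_gain_db
gap_ratio = db_to_power_ratio(gap_db)

if __name__ == "__main__":
    print(f"proposed: {proposed_ratio:.2f}x, baseline: {baseline_ratio:.2f}x")
    print(f"gap: {gap_db:.1f} dB, about {gap_ratio:.2f}x in power")
```

Under this assumption, a 3 dB improvement is roughly a doubling of the power ratio and 12 dB is roughly a sixteen-fold increase, so the 9 dB gap between the two methods is close to a factor of eight.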
