Journal: IEICE Transactions on Information and Systems

A Hybrid Approach to Electrolaryngeal Speech Enhancement Based on Noise Reduction and Statistical Excitation Generation

Abstract

This paper presents an electrolaryngeal (EL) speech enhancement method capable of significantly improving the naturalness of EL speech while causing no degradation in its intelligibility. An electrolarynx is an external device that artificially generates excitation sounds to enable laryngectomees to produce EL speech. Although proficient laryngectomees can produce quite intelligible EL speech, it sounds very unnatural due to the mechanical excitation produced by the device. Moreover, the excitation sounds produced by the device often leak outside and are added to EL speech as noise. To address these issues, there are two main conventional approaches to EL speech enhancement: noise reduction and statistical voice conversion (VC). The former usually causes no degradation in intelligibility but yields only small improvements in naturalness, as the mechanical excitation sounds remain essentially unchanged. The latter significantly improves the naturalness of EL speech by using spectral and excitation parameters of natural voices converted from acoustic parameters of EL speech, but it usually degrades intelligibility owing to conversion errors. We propose a hybrid approach that uses a noise reduction method to enhance spectral parameters and a statistical voice conversion method to predict excitation parameters. We further modify the prediction process of the excitation parameters to improve prediction accuracy and reduce the adverse effects of unvoiced/voiced prediction errors. The experimental results demonstrate that the proposed method yields significant improvements in naturalness compared with EL speech while keeping intelligibility high enough.
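
The division of labor in the hybrid approach can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not name a specific noise reduction algorithm, so plain spectral subtraction is swapped in here as an assumption, and `predict_f0` is a hypothetical stand-in for the statistical voice conversion model that predicts excitation parameters. The function names, the `floor` value, and the choice to feed the unenhanced EL features to the predictor are all illustrative.

```python
import numpy as np


def spectral_subtraction(mag, noise_mag, floor=0.05):
    """Reduce leaked excitation noise in a magnitude spectrogram.

    mag:       (frames, bins) magnitude spectrogram of EL speech.
    noise_mag: (bins,) estimated magnitude of the leaked excitation noise.
    floor:     fraction of the input magnitude kept as a lower bound
               (assumed value, avoids negative magnitudes).
    """
    cleaned = mag - noise_mag[np.newaxis, :]
    return np.maximum(cleaned, floor * mag)


def hybrid_enhance(mag, noise_mag, predict_f0):
    """Combine the two branches described in the abstract.

    The spectral parameters come from the noise reduction branch, and the
    excitation parameters (here only F0) come from a separate predictor.
    predict_f0 is a hypothetical callable mapping the (frames, bins) EL
    spectrogram to one F0 value per frame.
    """
    enhanced = spectral_subtraction(mag, noise_mag)
    f0 = np.asarray(predict_f0(mag), dtype=float)
    if f0.shape != (mag.shape[0],):
        raise ValueError("predict_f0 must return one value per frame")
    return enhanced, f0
```

A caller would pass a trained model's prediction function as `predict_f0` and hand both outputs to a vocoder for synthesis. The point of the sketch is only that the spectrum is never converted statistically, which is how the hybrid approach aims to avoid the intelligibility loss attributed to conversion errors.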