Journal of Information Science and Engineering

Frequency Warping for Speaker Adaptation in HMM-based Speech Synthesis



Abstract

Speaker adaptation in speech synthesis transforms a source utterance into a target utterance that differs from the source in voice characteristics. In this paper, we apply vocal tract length normalization, which is generally used in speech recognition to remove individual speaker characteristics, to speaker adaptation in speech synthesis. We propose a frequency warping approach based on a time-varying bilinear function to reduce the weighted spectral distance between the source speaker and the target speaker. The warped spectra of the source speaker are then converted to line spectrum pairs to train hidden Markov models (HMMs). The HMMs are further adapted to the target speaker's data by algorithms based on maximum likelihood linear regression. The experimental results show that our frequency warping approach brings the warped spectra of the source speaker closer to those of the target speaker, and that the resulting adapted HMMs outperform HMMs trained on unwarped spectra in terms of synthesized speech naturalness and speaker similarity.
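
The abstract does not give the warping formula or the search procedure, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes the standard first-order all-pass (bilinear) warping function commonly used for vocal tract length normalization, frames that are already time-aligned between the two speakers, and a simple grid search over the warping factor. The function names, the grid range, and the squared-error weighting are all hypothetical choices made for illustration.

```python
import numpy as np

def bilinear_warp(omega, alpha):
    """Map angular frequencies in [0, pi] through a first-order all-pass
    (bilinear) warping function with factor alpha, |alpha| < 1."""
    return omega + 2.0 * np.arctan2(alpha * np.sin(omega),
                                    1.0 - alpha * np.cos(omega))

def warp_spectrum(spec, alpha):
    """Resample one spectrum frame so that warped(w) = spec(bilinear_warp(w)).
    The direction of the mapping is a convention assumed here."""
    omega = np.linspace(0.0, np.pi, len(spec))
    return np.interp(bilinear_warp(omega, alpha), omega, spec)

def best_alpha(src, tgt, weights, grid=None):
    """Grid-search the alpha that minimizes a weighted squared spectral
    distance between the warped source frame and the target frame.
    (Assumed distance and search; the paper may use something else.)"""
    if grid is None:
        grid = np.linspace(-0.2, 0.2, 41)  # hypothetical search range
    dists = [np.sum(weights * (warp_spectrum(src, a) - tgt) ** 2) for a in grid]
    return float(grid[int(np.argmin(dists))])

def warp_utterance(src_frames, tgt_frames, weights):
    """Time-varying warping: choose one alpha per (pre-aligned) frame pair
    and return the warped source frames together with the alpha track."""
    alphas = np.array([best_alpha(s, t, weights)
                       for s, t in zip(src_frames, tgt_frames)])
    warped = np.stack([warp_spectrum(s, a)
                       for s, a in zip(src_frames, alphas)])
    return warped, alphas
```

In a pipeline like the one described in the abstract, the warped frames returned by `warp_utterance` would then be converted to line spectrum pairs before HMM training; that step is not shown here.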
