Conference: 2010 IEEE Region 10 Conference

An improved method for predicting fundamental frequency contour in Mandarin text-to-speech system with a small corpus


Abstract

This paper proposes a method for predicting the fundamental frequency (F0) contour in a Mandarin text-to-speech system built on a small corpus. First, to avoid large modifications to the speech clips, two corpora are established: a tonal syllable corpus and a high-frequency word corpus. Two models are then applied to predict the pitch contour of the speech. The traditional Fujisaki model is modified to fit the small corpus, and pitch jitter is simulated with a model based on a Gaussian mixture model (GMM). The pitch of the speech clips is then adjusted with the PSOLA algorithm according to the F0 contour predicted by the modified Fujisaki model and the jitter model, which improves the prosody of the synthesized speech and makes it sound more natural. Experiments demonstrate that the method is effective for Mandarin text-to-speech systems based on a small corpus.
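For orientation, the traditional Fujisaki model named in the abstract is usually written as a base frequency plus phrase and accent (tone) components in the log-F0 domain. The sketch below follows that standard textbook form only; it is not the modified model of the paper, whose details the abstract does not give. The function names, the constants alpha, beta and gamma, and all command values are illustrative assumptions.

```python
import math


def phrase_response(t, alpha=3.0):
    """Impulse response of the phrase component, Gp(t); zero before onset."""
    if t < 0:
        return 0.0
    return alpha * alpha * t * math.exp(-alpha * t)


def accent_response(t, beta=20.0, gamma=0.9):
    """Step response of the accent (tone) component, Ga(t), capped at gamma."""
    if t < 0:
        return 0.0
    return min(1.0 - (1.0 + beta * t) * math.exp(-beta * t), gamma)


def fujisaki_f0(t, base_hz, phrases, accents, alpha=3.0, beta=20.0, gamma=0.9):
    """Return F0 in Hz at time t (seconds) for the standard Fujisaki model.

    phrases: list of (onset_time, amplitude) phrase commands
    accents: list of (onset_time, offset_time, amplitude) accent commands
    """
    log_f0 = math.log(base_hz)
    for t0, ap in phrases:
        log_f0 += ap * phrase_response(t - t0, alpha)
    for t1, t2, aa in accents:
        log_f0 += aa * (
            accent_response(t - t1, beta, gamma)
            - accent_response(t - t2, beta, gamma)
        )
    return math.exp(log_f0)


# Hypothetical commands, not taken from the paper: one phrase command at 0 s
# and one accent command from 0.2 s to 0.5 s, sampled every 10 ms for 1 s.
contour = [
    fujisaki_f0(0.01 * i, 120.0, [(0.0, 0.5)], [(0.2, 0.5, 0.4)])
    for i in range(100)
]
```

Before any command starts, the contour sits at the base frequency; each command then raises log F0 and decays back. In a pipeline like the one the abstract describes, a contour of this kind would serve as the pitch target for the PSOLA adjustment step.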
