Source: 国际计算机前沿大会会议论文集

Towards Realizing Mandarin-Tibetan Bi-lingual Emotional Speech Synthesis with Mandarin Emotional Training Corpus

     

Abstract

This paper presents a hidden Markov model (HMM)-based method for Mandarin-Tibetan bi-lingual emotional speech synthesis that uses speaker adaptive training with a Mandarin emotional speech corpus. First, a one-speaker Tibetan neutral speech corpus, a multi-speaker Mandarin neutral speech corpus, and a multi-speaker Mandarin emotional speech corpus are used to train a set of mixed-language average acoustic models of the target emotion by speaker adaptive training. Then, a one-speaker Mandarin neutral speech corpus or a one-speaker Tibetan neutral speech corpus is used to obtain a set of speaker-dependent acoustic models of the target emotion through a speaker adaptation transformation. Finally, Mandarin or Tibetan emotional speech is synthesized from the corresponding Mandarin or Tibetan speaker-dependent acoustic models of the target emotion. In subjective tests, the average emotional mean opinion score is 4.14 for Tibetan and 4.26 for Mandarin, the average mean opinion score is 4.16 for Tibetan and 4.28 for Mandarin, and the average degradation opinion score is 4.28 for Tibetan and 4.24 for Mandarin. The proposed method can therefore synthesize both Tibetan and Mandarin speech with high naturalness and emotional expression using only a Mandarin emotional training speech corpus.
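The opinion scores reported in the abstract are averages of listener ratings. As a minimal sketch, assuming the usual 1-to-5 rating scale, the aggregation can be written as below. The function name `mean_opinion_score` and the sample ratings are hypothetical and are not taken from the paper, which does not describe its scoring procedure here.

```python
from statistics import mean


def mean_opinion_score(ratings):
    """Average a list of 1-to-5 listener ratings into a single score."""
    if not ratings:
        raise ValueError("at least one rating is required")
    if any(r < 1 or r > 5 for r in ratings):
        raise ValueError("ratings must lie on the 1-to-5 scale")
    # Round to two decimals, the precision used for the scores in the abstract.
    return round(mean(ratings), 2)


# Hypothetical ratings from five listeners for one synthesized utterance;
# these are illustrative values, not data from the paper.
example_ratings = [4, 5, 4, 4, 4]
print(mean_opinion_score(example_ratings))
```

The same averaging would apply to the emotional and degradation variants of the score; only the question put to the listeners changes.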
