A HMM-based Mandarin Chinese Singing Voice Synthesis System



We propose a Mandarin Chinese singing voice synthesis system based on hidden Markov model (HMM) speech synthesis techniques. A Mandarin Chinese singing voice corpus is recorded, and musical contextual features are designed for training. The F0 and spectrum of the singing voice are modeled simultaneously with context-dependent HMMs. This raises a new problem: the F0 data of singing voice are always sparse because of the large number of contexts, such as the tempo and pitch of a note, the key, and the time signature, so features that rarely appear in the training data cannot be modeled well. To address this problem, the difference between the F0 of the singing voice and that of the musical score (DF0) is modeled by a single Viterbi training. To overcome the over-smoothing of the generated F0 contour, a syllable-level F0 model based on the discrete cosine transform (DCT) is applied, and the F0 contour is generated by integrating the two levels of statistical models. The experimental results demonstrate that the proposed system outperforms the baseline system in both objective and subjective evaluations. The proposed system can generate a more natural F0 contour. Furthermore, the syllable-level F0 model can make the singing voice more expressive.
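
The two F0 ideas in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the abstract gives no formulas, so taking the difference in the log-frequency domain, the 20-frame syllable, the vibrato-like test contour, the cutoff `K = 5`, and all function names are assumptions made here for illustration only.

```python
import math

def midi_to_hz(note):
    """Convert a MIDI note number to Hz (A4 = note 69 = 440 Hz)."""
    return 440.0 * 2.0 ** ((note - 69) / 12.0)

def dct_ii(x):
    """Orthonormal DCT-II of a sequence of floats."""
    n = len(x)
    out = []
    for k in range(n):
        s = sum(x[t] * math.cos(math.pi * (t + 0.5) * k / n) for t in range(n))
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        out.append(scale * s)
    return out

def idct_ii(coeffs, n):
    """Inverse of dct_ii; coeffs may be truncated to fewer than n values."""
    out = []
    for t in range(n):
        v = coeffs[0] * math.sqrt(1.0 / n)
        for k in range(1, len(coeffs)):
            v += coeffs[k] * math.sqrt(2.0 / n) * math.cos(math.pi * (t + 0.5) * k / n)
        out.append(v)
    return out

# Hypothetical syllable: 20 frames sung on MIDI note 60 with a half-semitone wobble.
n_frames = 20
score_f0 = midi_to_hz(60)
sung_f0 = [
    score_f0 * 2.0 ** (0.5 * math.sin(2.0 * math.pi * t / 10.0) / 12.0)
    for t in range(n_frames)
]

# DF0: deviation of the sung F0 from the score F0 (log domain is an assumption).
df0 = [math.log(f) - math.log(score_f0) for f in sung_f0]

# Syllable-level parameterization: keep only the first K DCT coefficients.
K = 5
coeffs = dct_ii(df0)[:K]

# Rebuild a smoothed deviation contour and add the score pitch back.
smooth_df0 = idct_ii(coeffs, n_frames)
recon_f0 = [score_f0 * math.exp(d) for d in smooth_df0]
```

Modeling the deviation rather than the absolute F0 removes the dependence on the note pitch, which is one plausible reading of how DF0 eases the sparsity problem. The truncated DCT gives each syllable a short fixed-length description of its contour shape, whatever its duration in frames.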


  • Source
    《自动化学报:英文版》 | 2016, Issue 002 | pp. 192-202 | 11 pages
  • Authors

    Xian Li; Zengfu Wang;

  • Affiliations

    the Department of Automation, University of Science and Technology of China;

    the Institute of Intelligent Machines, Chinese Academy of Sciences;

  • Original format: PDF
  • Text language: CHI