Conference: IEEE International Conference on Acoustics, Speech and Signal Processing

Multi-stream spectral representation for statistical parametric speech synthesis



Abstract

In statistical parametric speech synthesis, such as Hidden Markov Model (HMM) based synthesis, one known problem is the over-smoothing of parameters, which makes the synthesised output sound muffled. In this paper, we propose an approach in which the high frequency spectrum is modelled separately from the low frequency spectrum. The high frequency band, which does not carry much linguistic information, is clustered using a very large decision tree so as to generate parameters as close as possible to natural speech samples. The boundary frequency can be adjusted at synthesis time for each state. Subjective listening tests show that the proposed approach is significantly preferred over the conventional approach using a single spectrum stream. Samples synthesised using the proposed approach sound less muffled and more natural.
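The idea of joining two spectral streams at an adjustable boundary frequency can be illustrated with a minimal sketch. The abstract does not say how the spectrum is parameterised or how the two bands are joined, so everything here is an assumption made for illustration: each stream is taken to be a spectral envelope sampled on a linear grid from 0 Hz to the Nyquist frequency, the join is a hard switch at the boundary bin, and the names `boundary_bin` and `merge_streams` are hypothetical.

```python
from typing import List, Sequence


def boundary_bin(boundary_hz: float, sample_rate: float, n_bins: int) -> int:
    """Map a boundary frequency in Hz to a bin index.

    Assumes n_bins points spaced linearly from 0 Hz to Nyquist (an
    illustrative choice, not something stated in the abstract).
    """
    nyquist = sample_rate / 2.0
    if not 0.0 <= boundary_hz <= nyquist:
        raise ValueError("boundary_hz must lie in [0, Nyquist]")
    return round(boundary_hz / nyquist * (n_bins - 1))


def merge_streams(
    low: Sequence[float],
    high: Sequence[float],
    boundary_hz: float,
    sample_rate: float,
) -> List[float]:
    """Take bins below the boundary from `low` and the rest from `high`.

    A hard switch is used for simplicity; a real system might cross-fade.
    """
    if len(low) != len(high):
        raise ValueError("both streams must cover the same frequency grid")
    k = boundary_bin(boundary_hz, sample_rate, len(low))
    return list(low[:k]) + list(high[k:])


# Hypothetical per-state usage: each state carries its own boundary.
low_env = [1.0] * 9    # stand-in for the low-frequency stream output
high_env = [2.0] * 9   # stand-in for the high-frequency stream output
state_boundaries_hz = [2000.0, 4000.0, 6000.0]
merged_per_state = [
    merge_streams(low_env, high_env, b, sample_rate=16000.0)
    for b in state_boundaries_hz
]
```

Because the boundary is an argument rather than a fixed constant, it can differ from state to state, which mirrors the per-state adjustment at synthesis time that the abstract mentions.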
