Roles of the Average Voice in Speaker-adaptive HMM-based Speech Synthesis

Abstract

In speaker-adaptive HMM-based speech synthesis, there are typically a few speakers whose synthetic speech sounds worse than that of other speakers, even though all speakers have the same amount of adaptation data drawn from the same corpus. This paper investigates these fluctuations in quality and concludes that as the mel-cepstral distance from the average voice grows, MOS naturalness scores generally become worse. Although this negative correlation is not particularly strong, it suggests a way to improve training and adaptation strategies. We also compare our findings with the work of other researchers on "vocal attractiveness."
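To make the two quantities in the abstract concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes the commonly used mel-cepstral distortion formula in dB (with the 0th coefficient excluded) and a plain Pearson correlation; the abstract does not state which exact distance or correlation measure the paper uses. The function names `mel_cepstral_distance` and `pearson` are hypothetical.

```python
import math


def mel_cepstral_distance(frames_a, frames_b):
    """Mean mel-cepstral distortion (dB) between two time-aligned sequences.

    Each frame is a list of mel-cepstral coefficients. The 0th coefficient
    (energy) is skipped, which is a common convention and an assumption here.
    """
    if len(frames_a) != len(frames_b) or not frames_a:
        raise ValueError("sequences must be non-empty and time-aligned")
    scale = 10.0 / math.log(10.0)
    total = 0.0
    for a, b in zip(frames_a, frames_b):
        squared = sum((x - y) ** 2 for x, y in zip(a[1:], b[1:]))
        total += scale * math.sqrt(2.0 * squared)
    return total / len(frames_a)


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)
```

Under these assumptions, one would compute `mel_cepstral_distance` between each adapted speaker and the average voice, then pass the per-speaker distances and MOS naturalness scores to `pearson`; a negative coefficient would correspond to the trend the abstract describes.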
