Conference: International Conference on Pattern Recognition Association of South Africa and Robotics and Mechatronics
Objective measures to improve the selection of training speakers in HMM-based child speech synthesis

Abstract

Building synthetic child voices is considered a difficult task due to the challenges associated with data collection. As a result, speaker adaptation in conjunction with Hidden Markov Model (HMM)-based synthesis has become prevalent in this domain because the approach caters for limited amounts of data. An initial average voice model is trained using data from multiple speakers and adapted to resemble a specific target child speaker. Due to the scarcity of child speech data, initial models used in this approach are mostly trained with adult speech data. However, selection of appropriate training speakers from large corpora is not a trivial task because there is no means, other than conducting exhaustive subjective listening tests, to determine which training speakers will yield the best quality synthetic child voice. Therefore, there is a need to find an objective measure that can be used to easily identify a small set of training speakers that will yield the best quality output. In this paper we investigate whether a relationship exists between objective and subjective voice evaluation measures with regard to the selection of training speakers for an average voice model used in speaker-adaptive HMM child speech synthesis. Results indicate that, if training speakers that are closer to the target speaker are used to train initial models, better quality child voices are generated.