Journal: International Journal of Speech Technology

A Text-to-Speech Platform for Variable Length Optimal Unit Searching Using Perception Based Cost Functions


Abstract

In concatenative Text-to-Speech, the size of the speech corpus is closely related to synthetic speech quality. In this paper, we describe our work on a new corpus-based Bell Labs TTS system. This work encompasses large acoustic inventories with a rich set of annotations, models and data structures for representing and managing such inventories, and an optimal unit selection algorithm that accommodates a broad range of possible cost criteria. We also propose a new method for setting weights in the cost functions based on a perceptual preference test. Our results show that this approach can successfully predict human preference patterns. Synthetic speech using weights determined in this manner consistently demonstrates smoother transitions and higher voice quality than speech using manually set weights.
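
The abstract does not give the search procedure or the cost functions, so the following is only a minimal sketch of how a unit selection search with weighted costs is commonly set up: a dynamic-programming (Viterbi-style) pass that minimizes a weighted sum of target costs and join costs. Every name here (`select_units`, `weighted_cost`, `f0_gap`, the `"f0"` feature) is hypothetical and not taken from the paper. The sketch also assumes one candidate list per target position, so it does not model the variable-length units named in the title, and the weights are plain inputs standing in for values that the paper derives from a perceptual preference test.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Unit = Dict[str, float]                  # hypothetical feature bundle, e.g. {"f0": 120.0}
CostFn = Callable[[Unit, Unit], float]   # one sub-cost between two feature bundles


def weighted_cost(a: Unit, b: Unit, fns: Sequence[CostFn], weights: Sequence[float]) -> float:
    """Weighted sum of sub-costs between two feature bundles."""
    return sum(w * f(a, b) for w, f in zip(weights, fns))


def select_units(
    targets: List[Unit],
    candidates: List[List[Unit]],
    target_fns: Sequence[CostFn],
    target_w: Sequence[float],
    join_fns: Sequence[CostFn],
    join_w: Sequence[float],
) -> Tuple[List[int], float]:
    """Return the chosen candidate index per position and the total path cost."""
    if not targets or len(targets) != len(candidates) or any(not c for c in candidates):
        raise ValueError("need one non-empty candidate list per target")

    # best[j]: cheapest cost of any path ending in candidate j at the current position
    best = [weighted_cost(targets[0], c, target_fns, target_w) for c in candidates[0]]
    back: List[List[int]] = []

    for t in range(1, len(targets)):
        new_best: List[float] = []
        pointers: List[int] = []
        for cand in candidates[t]:
            target_cost = weighted_cost(targets[t], cand, target_fns, target_w)
            # Choose the predecessor that minimizes accumulated cost plus join cost.
            k, cost = min(
                (
                    (i, best[i] + weighted_cost(prev, cand, join_fns, join_w))
                    for i, prev in enumerate(candidates[t - 1])
                ),
                key=lambda pair: pair[1],
            )
            new_best.append(cost + target_cost)
            pointers.append(k)
        best = new_best
        back.append(pointers)

    # Trace the cheapest path back from the last position.
    j = min(range(len(best)), key=best.__getitem__)
    total = best[j]
    path = [j]
    for pointers in reversed(back):
        j = pointers[j]
        path.append(j)
    path.reverse()
    return path, total


def f0_gap(a: Unit, b: Unit) -> float:
    """Hypothetical sub-cost: absolute pitch difference."""
    return abs(a["f0"] - b["f0"])
```

In this sketch, changing only the join weight changes which units are selected, which is why the choice of weights matters. With targets at 100 Hz and 130 Hz and candidates `[[100, 110], [130, 108]]`, a join weight of 0 picks the units closest to the targets, while a join weight of 10 prefers the smoother pair.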
