Annual Conference of the International Speech Communication Association

Psychoacoustic Segment Scoring for Multi-Form Speech Synthesis

Abstract

In multi-form segment synthesis, output speech is constructed by splicing waveform segments with parametric speech segments that are statistically modeled and regenerated. The fraction of model-derived segments is called the model-template ratio. The motivation of this work is to further increase the flexibility of multi-form synthesis while maintaining high speech quality at high model-template ratios. An approach is presented in which the representation type of a segment is selected per acoustic leaf. We introduce a novel method for leaf representation selection based on a psychoacoustic segment stationarity score. Additionally, we present refinements to multi-form segment concatenation for high model-template ratios, including boundary-constrained statistical parametric synthesis and time-domain alignment based on multi-peak analysis of the cross-correlation.
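
The abstract names time-domain alignment based on multi-peak analysis of the cross-correlation but gives no details. The following is only a minimal sketch of the general idea, not the authors' method: it computes a normalized cross-correlation over a range of lags, collects all local peaks instead of only the global maximum, and then picks one. The function names, the `margin` parameter, and the selection rule (the smallest absolute lag among near-best peaks) are assumptions made for illustration.

```python
import numpy as np


def lagged_correlation(ref, seg, max_lag):
    """Normalized correlation between ref[k] and seg[k + d] for d in [-max_lag, max_lag]."""
    ref = np.asarray(ref, dtype=float)
    seg = np.asarray(seg, dtype=float)
    n = min(len(ref), len(seg))
    lags = np.arange(-max_lag, max_lag + 1)
    scores = np.zeros(len(lags))
    for i, d in enumerate(lags):
        # Keep only the samples where both signals overlap at this lag.
        if d >= 0:
            a, b = ref[:n - d], seg[d:n]
        else:
            a, b = ref[-d:n], seg[:n + d]
        denom = np.sqrt(np.dot(a, a) * np.dot(b, b))
        scores[i] = np.dot(a, b) / denom if denom > 0 else 0.0
    return lags, scores


def pick_alignment_lag(ref, seg, max_lag, margin=0.1):
    """Choose a lag from several correlation peaks (hypothetical selection rule)."""
    lags, scores = lagged_correlation(ref, seg, max_lag)
    interior = np.arange(1, len(scores) - 1)
    is_peak = (scores[interior] > scores[interior - 1]) & (
        scores[interior] >= scores[interior + 1]
    )
    peak_idx = interior[is_peak]
    if len(peak_idx) == 0:
        # No interior local maximum: fall back to the global maximum.
        return int(lags[np.argmax(scores)])
    best = scores[peak_idx].max()
    # Assumption: peaks within `margin` of the best score count as equally good,
    # and the smallest shift among them is preferred.
    strong = peak_idx[scores[peak_idx] >= best - margin]
    return int(lags[strong[np.argmin(np.abs(lags[strong]))]])


# Periodic test signal: seg is ref delayed by 5 samples (period 40 samples).
t = np.arange(400)
ref = np.sin(2 * np.pi * t / 40)
seg = np.sin(2 * np.pi * (t - 5) / 40)
lag = pick_alignment_lag(ref, seg, max_lag=60)
```

For a periodic (voiced-like) signal such as this one, the correlation has near-equal peaks one period apart, so a plain global maximum can land on any of them. Looking at several peaks lets the selection rule favor the smallest shift, which is one plausible reason to analyze multiple peaks at a splice point.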