Published in: 2018 IEEE Spoken Language Technology Workshop

Scaling and Bias Codes for Modeling Speaker-Adaptive DNN-Based Speech Synthesis Systems


Abstract

Most neural-network-based speaker-adaptive acoustic models for speech synthesis fall into one of two categories: layer-based approaches or input-code approaches. Although each approach has its own pros and cons, most existing work on speaker adaptation focuses on improving one or the other. In this paper, we first systematically review the common principles of neural-network-based speaker-adaptive models, then show that these approaches can be represented in a unified framework and generalized further. More specifically, we introduce scaling and bias codes as a generalized means of speaker-adaptive transformation. With these codes, we can build a more efficient factorized speaker-adaptive model that captures the advantages of both approaches while reducing their disadvantages. Experiments show that the proposed method improves speaker adaptation performance compared with adaptation based on the conventional input code.
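
To make the idea of scaling and bias codes concrete, the following is a minimal sketch of one hidden layer whose activations are scaled and shifted by speaker-specific codes. The abstract does not give the layer equation, so the form used here, h = tanh((1 + A·s_scale) * (W·x + c) + B·s_bias), is an assumption for illustration only. The names `adaptive_layer`, `A`, `B`, `scale_code`, and `bias_code` are hypothetical, and the toy weights are arbitrary.

```python
import math


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(w * v for w, v in zip(row, vector)) for row in matrix]


def adaptive_layer(x, W, c, A, B, scale_code, bias_code):
    """One hidden layer with speaker-dependent scaling and bias.

    Assumed form (illustrative, not taken from the paper):
        h = tanh((1 + A @ scale_code) * (W @ x + c) + B @ bias_code)
    W and c are shared across speakers; A and B project the low-dimensional
    speaker codes to one scale and one bias value per hidden unit.
    """
    shared = [z + ci for z, ci in zip(matvec(W, x), c)]
    # The "1 +" offset is a design choice of this sketch: an all-zero
    # scaling code then leaves the shared layer unchanged.
    scale = [1.0 + a for a in matvec(A, scale_code)]
    bias = matvec(B, bias_code)
    return [math.tanh(s * z + b) for s, z, b in zip(scale, shared, bias)]


# Toy, hand-picked parameters: 2 inputs, 2 hidden units, 1-dimensional codes.
x = [1.0, 2.0]
W = [[0.5, -0.2], [0.1, 0.3]]
c = [0.0, 0.1]
A = [[0.2], [-0.1]]
B = [[0.3], [0.0]]

unadapted = adaptive_layer(x, W, c, A, B, [0.0], [0.0])
adapted = adaptive_layer(x, W, c, A, B, [1.0], [1.0])
```

In this sketch, setting the scaling code to zero leaves only a speaker-dependent bias, which is how a speaker code appended to a layer's input acts on that layer. Using only the scaling term resembles layer-based schemes that rescale hidden units per speaker. Whether the paper uses exactly this parameterization is not stated in the abstract.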
