Scaling and Bias Codes for Modeling Speaker-Adaptive DNN-Based Speech Synthesis Systems

Abstract

Most neural-network based speaker-adaptive acoustic models for speech synthesis can be categorized into either layer-based or input-code approaches. Although both approaches have their own pros and cons, most existing works on speaker adaptation focus on improving one or the other. In this paper, after we first systematically overview the common principles of neural-network based speaker-adaptive models, we show that these approaches can be represented in a unified framework and can be generalized further. More specifically, we introduce the use of scaling and bias codes as generalized means for speaker-adaptive transformation. By utilizing these codes, we can create a more efficient factorized speaker-adaptive model and capture advantages of both approaches while reducing their disadvantages. The experiments show that the proposed method can improve the performance of speaker adaptation compared with speaker adaptation based on the conventional input code.
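
As a rough illustration of the idea named in the abstract, the sketch below applies a speaker-specific scaling vector and bias vector to the shared affine output of one hidden layer. The abstract does not give the exact formulation, so the function name `adapted_layer`, the element-wise form, the `tanh` activation, and all numeric values are assumptions made for illustration only, not the authors' implementation.

```python
import math

def adapted_layer(x, W, b, scale_code, bias_code):
    """One hidden layer whose pre-activation is scaled and shifted per speaker.

    Assumed form (not taken from the paper): h_i = tanh(a_i * (W x + b)_i + c_i),
    where a is the speaker's scaling code and c is the speaker's bias code.
    """
    out = []
    for row, b_i, a_i, c_i in zip(W, b, scale_code, bias_code):
        z = sum(w * x_j for w, x_j in zip(row, x)) + b_i  # shared, speaker-independent part
        out.append(math.tanh(a_i * z + c_i))              # speaker-specific scale and bias
    return out

# Hypothetical toy values: a 2-unit layer with identity weights.
x = [0.5, -1.0]
W = [[1.0, 0.0], [0.0, 1.0]]
b = [0.0, 0.0]

# Scale 1 and bias 0 leave the shared layer unchanged.
neutral = adapted_layer(x, W, b, [1.0, 1.0], [0.0, 0.0])

# A different speaker changes only the two small code vectors, not W or b.
speaker_a = adapted_layer(x, W, b, [2.0, 0.5], [0.1, -0.1])
```

In this reading, a bias-only code plays the role of a conventional input code (an additive, speaker-dependent shift), while the scaling code adds a multiplicative, layer-style transformation, which is one way the two families could sit in a single framework.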


Bibliographic Record

Source
2018 IEEE Spoken Language Technology Workshop | 2018 | pp. 610-617 | 8 pages
Conference location: Athens (GR)
Authors
Hieu-Thi Luong; Junichi Yamagishi
Author affiliations
National Institute of Informatics, Tokyo, Japan
Original format: PDF
Language: English
Keywords
Adaptation models; Speech synthesis; Matrix decomposition; Neural networks; Acoustics; Hidden Markov models; Data models

Similar Literature

1. DNN-Based Speech Synthesis Using Speaker Codes [J]. Nobukatsu HOJO, Yusuke IJIMA, Hideyuki MIZUNO. IEICE Transactions on Information and Systems, 2018, No. 2.
2. Improving Trajectory Modelling for DNN-Based Speech Synthesis by Using Stacked Bottleneck Features and Minimum Generation Error Training [J]. Zhizheng Wu, Simon King. IEEE/ACM Transactions on Audio, Speech, and Language Processing, 2016, No. 7.
3. Speech production model and its application to speech synthesis system -speech synthesis by mimicking human speech production- [J]. Masaaki Honda. NTT R&D, 1998, No. 4.
4. Scaling and Bias Codes for Modeling Speaker-Adaptive DNN-Based Speech Synthesis Systems [C]. Hieu-Thi Luong, Junichi Yamagishi. Spoken Language Technology Workshop, 2018.
5. Hierarchical modeling and robust synthesis for the preliminary design of large scale complex systems [D]. Koch, Patrick Nathan. 1998.
6. The genetic code can cause systematic bias in simple phylogenetic models [O]. Simon Whelan. 2008.
7. Measuring the contribution to cognitive load of each predicted vocoder speech parameter in DNN-based speech synthesis [O]. Avashna Govender, Cassia Valentini-Botinhao, Simon King. 2019.
8. SELECTED METHODS FOR IMPROVING SYNTHESIS SPEECH QUALITY USING LINEAR PREDICTIVE CODING: SYSTEM DESCRIPTION, COEFFICIENT SMOOTHING AND STREAK [R]. Steven Frank Boll. 1974.
