An investigation of multi-speaker training for wavenet vocoder

Abstract

In this paper, we investigate the effectiveness of multi-speaker training for the WaveNet vocoder. In our previous work, we demonstrated that our proposed speaker-dependent (SD) WaveNet vocoder, which is trained on a single speaker's speech data, is capable of modeling temporal waveform structure, such as phase information, and can generate more natural-sounding synthetic voices than the conventional high-quality vocoder STRAIGHT. However, it is still difficult to generate synthetic voices of various speakers with the SD-WaveNet because of its speaker-dependent property. Toward the development of a speaker-independent WaveNet vocoder, we apply multi-speaker training techniques to the WaveNet vocoder and investigate their effectiveness. The experimental results demonstrate that 1) the multi-speaker WaveNet vocoder still outperforms STRAIGHT in generating known speakers' voices but is comparable to STRAIGHT in generating unknown speakers' voices, and 2) multi-speaker training is effective for developing a WaveNet vocoder capable of speech modification.

Bibliographic details

Source
2017 IEEE Automatic Speech Recognition and Understanding Workshop | 2017 | pp. 712-718 | 7 pages
Conference location: Okinawa (JP)
Authors
Tomoki Hayashi; Akira Tamamori; Kazuhiro Kobayashi; Kazuya Takeda; Tomoki Toda;
Author affiliations

Graduate School of Information Science, Nagoya University, Japan;

Institute of Innovation for Future Society, Nagoya University, Japan;

Information Technology Center, Nagoya University, Japan;

Graduate School of Information Science, Nagoya University, Japan;

Information Technology Center, Nagoya University, Japan;

Original format: PDF
Text language: English
Keywords
Vocoders; Speech; Training; Acoustics; Convolution; Speech synthesis;

Similar literature

1. Group Sparse Representation With WaveNet Vocoder Adaptation for Spectrum and Prosody Conversion [J]. Sisman Berrak, Zhang Mingyang, Li Haizhou. IEEE/ACM Transactions on Audio, Speech, and Language Processing. 2019, No. 6
2. Do prosodic manual annotations matter for Japanese speech synthesis systems with WaveNet vocoder? [J]. Hieu-Thi LUONG, Xin WANG, Junichi YAMAGISHI. 電子情報通信学会技術研究報告. 信号処理. Signal Processing. 2017, No. 516
3. An investigation of multi-speaker training for wavenet vocoder [C]. Tomoki Hayashi, Akira Tamamori, Kazuhiro Kobayashi. IEEE Workshop on Automatic Speech Recognition and Understanding. 2017
4. The relationship between training design and trainee differences on training outcomes: An experimental investigation of the treatment of socialization and training content in the e-learning environment [D]. Yanson, Regina. 2012
5. Clinical Investigator Training Program (CITP) – A practical and pragmatic approach to conveying clinical investigator competencies and training to busy clinicians [O]. Mansoor Saleh, Gurudatta Naik, Penelope Jester. 2020
6. Refined WaveNet Vocoder for Variational Autoencoder Based Voice Conversion [O]. Wen-Chin Huang, Yi-Chiao Wu, Hsin-Te Hwang. 2019
7. Study and Investigation of Microminiaturization of Vocoder Analog Circuitry [R]. Miles, Robert; Danielson, Gordon. 1978
