SYNTHESIS OF SPEECH FROM TEXT IN A VOICE OF A TARGET SPEAKER USING NEURAL NETWORKS

Abstract

Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for speech synthesis. The methods, systems, and apparatus include actions of obtaining an audio representation of speech of a target speaker, obtaining input text for which speech is to be synthesized in a voice of the target speaker, generating a speaker vector by providing the audio representation to a speaker encoder engine that is trained to distinguish speakers from one another, generating an audio representation of the input text spoken in the voice of the target speaker by providing the input text and the speaker vector to a spectrogram generation engine that is trained using voices of reference speakers to generate audio representations, and providing the audio representation of the input text spoken in the voice of the target speaker for output.
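
The data flow described in the abstract can be sketched as follows. This is a minimal illustration only, not the patented implementation: `toy_speaker_encoder` and `toy_spectrogram_generator` are hypothetical placeholders standing in for the trained neural networks, and every function name, parameter, and the vector size are assumptions made for the example. It shows the order of the steps: target-speaker audio is turned into a fixed-size speaker vector, and that vector together with the input text is turned into an audio representation (here, a list of spectrogram-like frames).

```python
from typing import Callable, List, Sequence

Vector = List[float]
Spectrogram = List[Vector]  # one vector of values per time frame


def toy_speaker_encoder(audio: Sequence[float], dim: int = 4) -> Vector:
    """Hypothetical stand-in for the trained speaker encoder engine.

    Maps audio of any length to a fixed-size vector by averaging the
    absolute amplitude over `dim` consecutive chunks.
    """
    if not audio:
        raise ValueError("audio must be non-empty")
    chunk = max(1, len(audio) // dim)
    vector = []
    for i in range(dim):
        part = list(audio[i * chunk:(i + 1) * chunk]) or [0.0]
        vector.append(sum(abs(x) for x in part) / len(part))
    return vector


def toy_spectrogram_generator(text: str, speaker_vector: Vector) -> Spectrogram:
    """Hypothetical stand-in for the trained spectrogram generation engine.

    Emits one frame per character; each frame is offset by the speaker
    vector so the output depends on both the text and the speaker.
    """
    return [[(ord(ch) % 32) / 32.0 + s for s in speaker_vector] for ch in text]


def synthesize(
    target_audio: Sequence[float],
    input_text: str,
    speaker_encoder: Callable[[Sequence[float]], Vector] = toy_speaker_encoder,
    spectrogram_generator: Callable[[str, Vector], Spectrogram] = toy_spectrogram_generator,
) -> Spectrogram:
    # Step 1: derive the speaker vector from the target speaker's audio.
    speaker_vector = speaker_encoder(target_audio)
    # Step 2: condition the audio representation on text plus speaker vector.
    return spectrogram_generator(input_text, speaker_vector)
```

Calling `synthesize(samples, "hello")` with a list of audio samples returns one frame per input character in this toy version. In a real system the two callables would be replaced by trained models, which is why they are passed in as parameters here.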

Bibliographic details

Publication number: WO2019222591A1
Publication date: 2019-11-21
Original format: PDF
Applicant/assignee: GOOGLE LLC

Application number: WO2019US32815
Inventors: JIA YE; WANG QUAN; NGUYEN PATRICK AN PHU; CHEN ZHIFENG; WU YONGHUI; SHEN JONATHAN; PANG RUOMING; WEISS RON J.; MORENO IGNACIO LOPEZ; REN FEI; ZHANG YU

Filing date: 2019-05-17
Classification: G10L13/033; G10L13/04; G10L25/30
Country: WO
Added to database: 2022-08-21 11:14:41
