Neural network generative modeling to transform speech utterances and augment training data

Abstract

Systems, methods, and devices for speech transformation and generating synthetic speech using deep generative models are disclosed. A method of the disclosure includes receiving input audio data comprising a plurality of iterations of a speech utterance from a plurality of speakers. The method includes generating an input spectrogram based on the input audio data and transmitting the input spectrogram to a neural network configured to generate an output spectrogram. The method includes receiving the output spectrogram from the neural network and, based on the output spectrogram, generating synthetic audio data comprising the speech utterance.
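
The flow described in the abstract (audio in, input spectrogram, neural network, output spectrogram, synthetic audio out) can be sketched as follows. This is only an illustrative sketch under stated assumptions, not the patented implementation: the function names, the 256-sample Hann-windowed STFT with a hop of 128, the `placeholder_network` stand-in (which merely perturbs log-magnitudes instead of running a trained generative model), and the reuse of the input phase for inversion are all hypothetical choices that the abstract does not specify. The sine waves stand in for recordings of the same utterance from different speakers.

```python
import numpy as np

def stft(signal, frame_len=256, hop=128):
    """Complex short-time Fourier transform using a Hann window."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack([signal[i * hop: i * hop + frame_len] * window
                       for i in range(n_frames)])
    return np.fft.rfft(frames, axis=1)  # shape: (n_frames, frame_len // 2 + 1)

def istft(spec, frame_len=256, hop=128):
    """Inverse STFT by windowed overlap-add, normalized by the summed squared window."""
    window = np.hanning(frame_len)
    out = np.zeros(frame_len + hop * (spec.shape[0] - 1))
    norm = np.zeros_like(out)
    for i, frame in enumerate(np.fft.irfft(spec, n=frame_len, axis=1)):
        start = i * hop
        out[start:start + frame_len] += frame * window
        norm[start:start + frame_len] += window ** 2
    return out / np.maximum(norm, 1e-8)

def placeholder_network(spectrogram, rng):
    """Hypothetical stand-in for the trained generative model.

    It adds small noise in the log-magnitude domain so that the output
    spectrogram differs from the input; a real system would run a trained
    network here.
    """
    log_mag = np.log1p(spectrogram)
    perturbed = log_mag + 0.05 * rng.standard_normal(log_mag.shape)
    return np.expm1(np.maximum(perturbed, 0.0))  # keep magnitudes non-negative

def transform_utterance(audio, rng, frame_len=256, hop=128):
    """Audio -> input spectrogram -> network -> output spectrogram -> audio."""
    spec = stft(audio, frame_len, hop)
    input_spectrogram = np.abs(spec)
    output_spectrogram = placeholder_network(input_spectrogram, rng)
    # Assumption: reuse the input phase to invert the output magnitudes.
    phase = np.angle(spec)
    return istft(output_spectrogram * np.exp(1j * phase), frame_len, hop)

# Synthetic tones standing in for several speakers saying the same utterance.
rng = np.random.default_rng(0)
sample_rate = 16000
t = np.arange(sample_rate) / sample_rate
recordings = [np.sin(2 * np.pi * f0 * t) for f0 in (180.0, 220.0)]
synthetic = [transform_utterance(r, rng) for r in recordings]
```

Each element of `synthetic` is a waveform of the same length as its input and could be added to a training set alongside the original recordings, which is the augmentation use named in the title.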

Bibliographic details

Publication number: US10937438B2

Publication date: 2021-03-02

Original format: PDF
Applicant/Assignee: FORD GLOBAL TECHNOLOGIES LLC

Application number: US201815940639
Inventors: PRAVEEN NARAYANAN; LISA SCARIA; FRANCOIS CHARETTE; ASHLEY ELIZABETH MICKS; RYAN BURKE

Filing date: 2018-03-29
Classification: G10L21/02; G10L15/16; G10L25/03; G10L15/06; G06F3/16; G06N5/04
Country: US
Date indexed: 2022-08-24 17:26:04
