Neural network generative modeling to transform speech utterances and augment training data

Abstract

Systems, methods, and devices for speech transformation and generating synthetic speech using deep generative models are disclosed. A method of the disclosure includes receiving input audio data comprising a plurality of iterations of a speech utterance from a plurality of speakers. The method includes generating an input spectrogram based on the input audio data and transmitting the input spectrogram to a neural network configured to generate an output spectrogram. The method includes receiving the output spectrogram from the neural network and, based on the output spectrogram, generating synthetic audio data comprising the speech utterance.
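The data flow described in the abstract (input audio, input spectrogram, neural network, output spectrogram, synthetic audio) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the patented method: `toy_network` is a hypothetical, untrained stand-in for the deep generative model, the sine tones stand in for recordings of one utterance from several speakers, and reusing the input phase to invert the output spectrogram is an assumption the abstract does not specify. All function names and parameters (`n_fft`, `hop`, the latent size) are chosen for the example.

```python
import numpy as np

N_FFT = 256  # frame length in samples (assumed value)
HOP = 64     # hop between frames in samples (assumed value)


def stft(signal, n_fft=N_FFT, hop=HOP):
    """Complex short-time Fourier transform, shape (frames, n_fft // 2 + 1)."""
    window = np.hanning(n_fft)
    n_frames = 1 + (len(signal) - n_fft) // hop
    frames = np.stack(
        [signal[i * hop : i * hop + n_fft] * window for i in range(n_frames)]
    )
    return np.fft.rfft(frames, axis=1)


def istft(spec, n_fft=N_FFT, hop=HOP):
    """Overlap-add inverse of stft, normalised by the summed squared window."""
    window = np.hanning(n_fft)
    n_frames = spec.shape[0]
    out = np.zeros(n_fft + hop * (n_frames - 1))
    norm = np.zeros_like(out)
    frames = np.fft.irfft(spec, n=n_fft, axis=1)
    for i in range(n_frames):
        out[i * hop : i * hop + n_fft] += frames[i] * window
        norm[i * hop : i * hop + n_fft] += window ** 2
    return out / np.maximum(norm, 1e-8)


def toy_network(log_mag, seed=0, latent_dim=16):
    """Hypothetical stand-in for the generative model: an untrained
    encoder/decoder pair with random weights, applied frame by frame."""
    rng = np.random.default_rng(seed)
    n_bins = log_mag.shape[1]
    w_enc = rng.normal(scale=0.1, size=(n_bins, latent_dim))
    w_dec = rng.normal(scale=0.1, size=(latent_dim, n_bins))
    latent = np.tanh(log_mag @ w_enc)
    return latent @ w_dec  # output log-magnitude spectrogram


def synthesize(audio):
    """Audio -> input spectrogram -> network -> output spectrogram -> audio."""
    spec = stft(audio)
    log_mag = np.log(np.abs(spec) + 1e-6)        # input spectrogram
    out_log_mag = toy_network(log_mag)           # output spectrogram
    phase = np.exp(1j * np.angle(spec))          # assumption: reuse input phase
    return istft(np.exp(out_log_mag) * phase)    # synthetic audio


# Toy stand-ins for one utterance spoken by three speakers (different pitches).
sample_rate = 8000
t = np.arange(4000) / sample_rate
utterances = [np.sin(2 * np.pi * f0 * t) for f0 in (110.0, 165.0, 220.0)]
synthetic = [synthesize(u) for u in utterances]
```

Because the weights here are random, the resulting audio is not meaningful speech; the sketch only shows how the spectrogram and inversion steps fit around the model. In practice the network would be trained, and a phase-estimation method or vocoder would likely replace the reused input phase.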
