Conference: International Computer Engineering Conference

Adaptive Speech Synthesis and training using low-cost AI Computing



Abstract

Computer synthesis of human speech has long been possible. The infrastructure and computing needed to apply subtle inflections, vocal character, phrasing, and modulation are inexpensive to build; however, these additions come at a price: not the expense of the hardware, but the large number of constants to be set and time-varying parameters that must be dynamically controlled. Although these parameters can be set manually, doing so is not time efficient, and one or even dozens of dynamic descriptors are not enough to emulate realistic speech. When the vocal representation is converted to a 2-D image, an adaptation of the spectrogram, it can be easily analyzed using a convolutional neural network (CNN). Existing low-cost tools for CNN analysis can now be applied. This structure, coupled with a fully connected neural network, can serve as a human-interactive interface for real-time training of the AI synthesizer, emulating the phrasing and even the accent of the trainer. This paper details the structure of the synthesizer and the training AI, and presents some practical, functional examples exploring this technique using Keras, LabVIEW, and MATLAB.
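As a rough illustration of the "2-D image" step mentioned in the abstract, the sketch below turns a waveform into a plain magnitude spectrogram (time frames by frequency bins), which is the kind of array a CNN could take as input. The abstract does not specify the paper's adaptation of the spectrogram, so everything here is an assumption: the function names `hann` and `spectrogram`, the frame length, the hop size, and the 1 kHz test tone are all hypothetical, and a naive DFT is used only to keep the example dependency-free.

```python
import cmath
import math


def hann(n):
    """Symmetric Hann window of length n."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]


def spectrogram(signal, frame_len=64, hop=32):
    """Return a 2-D list indexed [frame][frequency bin] of DFT magnitudes.

    A naive DFT is used for clarity; a practical pipeline would use an FFT.
    """
    window = hann(frame_len)
    n_bins = frame_len // 2 + 1  # non-negative frequencies only
    image = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = [signal[start + i] * window[i] for i in range(frame_len)]
        row = []
        for k in range(n_bins):
            acc = sum(
                frame[i] * cmath.exp(-2j * math.pi * k * i / frame_len)
                for i in range(frame_len)
            )
            row.append(abs(acc))
        image.append(row)
    return image


# Hypothetical test input: a 1 kHz sine sampled at 8 kHz (not from the paper).
sample_rate = 8000
tone = [math.sin(2 * math.pi * 1000 * t / sample_rate) for t in range(512)]

image = spectrogram(tone)
# Bin spacing is sample_rate / frame_len = 125 Hz, so 1 kHz falls in bin 8.
peak_bin = max(range(len(image[0])), key=lambda k: image[0][k])
```

Each row of `image` is one time frame and each column one frequency bin, so the array can be treated like a single-channel picture. Feeding it to a CNN (for example in Keras, which the abstract names) would be a separate step that is not shown here.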
