Conference: International Computer Engineering Conference

Adaptive Speech Synthesis and training using low-cost AI Computing

Abstract

Computer synthesis of human speech has long been possible. The infrastructure and computing needed to apply subtle inflections, vocal character, phrasing, and modulation are inexpensive to build; however, these additions come at a price: not the expense of the hardware, but the large number of constants that must be set and time-varying parameters that must be dynamically controlled. Although these parameters can be set manually, doing so is not time efficient, and one or even dozens of dynamic descriptors are not enough to emulate realistic speech. When the vocal representation is converted to a 2-D image, an adaptation of the spectrogram, it can be easily analyzed using a convolutional neural network (CNN), so existing low-cost tools for CNN analysis can now be applied. This structure, coupled with a fully connected neural network, can serve as a human-interactive interface for real-time training of the AI synthesizer, emulating the phrasing and even the accent of the trainer. This paper details the structure of the synthesizer and the training AI, and presents some practical, functional examples exploring this technique using Keras, LabVIEW, and MATLAB.
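The step of turning a vocal signal into a 2-D image can be illustrated with a minimal sketch. The abstract does not specify the paper's "adaptation of the spectrogram", so the code below assumes an ordinary log-magnitude short-time Fourier spectrogram; the function name `spectrogram_image`, the frame length, the hop size, and the 1 kHz test tone are all hypothetical choices made for illustration only.

```python
import numpy as np


def spectrogram_image(signal, frame_len=256, hop=128):
    """Return a log-magnitude spectrogram scaled to [0, 1].

    Rows are frequency bins and columns are time frames.
    frame_len and hop are illustrative values, not taken from the paper.
    """
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    # Slice the signal into overlapping, windowed frames.
    frames = np.stack([
        signal[i * hop: i * hop + frame_len] * window
        for i in range(n_frames)
    ])
    # One-sided FFT magnitude per frame: shape (n_frames, frame_len // 2 + 1).
    magnitude = np.abs(np.fft.rfft(frames, axis=1))
    log_mag = np.log1p(magnitude)
    # Normalize to [0, 1] so the array can be treated as a grayscale image.
    span = log_mag.max() - log_mag.min()
    if span > 0:
        image = (log_mag - log_mag.min()) / span
    else:
        image = np.zeros_like(log_mag)
    return image.T


# Assumed test input: one second of a 1 kHz tone sampled at 8 kHz.
sample_rate = 8000
t = np.arange(sample_rate) / sample_rate
tone = np.sin(2 * np.pi * 1000 * t)
image = spectrogram_image(tone)
```

An array of this kind, with a batch axis and a trailing channel axis added, has the (batch, height, width, channels) layout that a Keras `Conv2D` layer accepts by default. How the paper itself prepares and feeds the image is not stated in the abstract.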
