Adaptive Speech Synthesis and training using low-cost AI Computing

Abstract

Computer synthesis of human speech has long been possible. The infrastructure and computing needed to apply subtle inflections, vocal character, phrasing, and modulation are inexpensive to build; however, these additions come at a price: not the expense of the hardware, but the large number of constants that must be set and time-varying parameters that must be dynamically controlled. Although these parameters can be set manually, doing so is not time-efficient, and one or even dozens of dynamic descriptors are not enough to emulate realistic speech. When the vocal representation is converted to a 2-D image, an adaptation of the spectrogram, it can be readily analyzed using a convolutional neural network (CNN), so existing low-cost tools for CNN analysis can be applied. This structure, coupled with a fully connected neural network, can serve as a human-interactive interface for real-time training of the AI synthesizer, emulating the phrasing and even the accent of the trainer. This paper details the structure of the synthesizer and the training AI, and presents practical, functional examples exploring this technique using Keras, LabVIEW, and MATLAB.
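
As a rough illustration of the 2-D representation the abstract mentions, the sketch below turns a one-dimensional signal into a time-by-frequency magnitude image of the kind a CNN could take as input. It is not the authors' implementation: the abstract does not specify their "adaptation of the spectrogram", so a plain short-time Fourier magnitude is substituted here. The function name `spectrogram`, the frame length, the hop size, and the test tone are all assumptions chosen for the example, and only the Python standard library is used.

```python
import cmath
import math


def spectrogram(signal, frame_len=64, hop=32):
    """Return a 2-D list: rows are time frames, columns are frequency bins."""
    # Hann window reduces spectral leakage at the frame edges.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / (frame_len - 1))
              for n in range(frame_len)]
    n_bins = frame_len // 2 + 1  # keep non-negative frequencies only
    image = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = [signal[start + n] * window[n] for n in range(frame_len)]
        row = []
        for k in range(n_bins):
            # Naive DFT of one bin; fine for a sketch, slow for real audio.
            acc = sum(frame[n] * cmath.exp(-2j * math.pi * k * n / frame_len)
                      for n in range(frame_len))
            row.append(abs(acc))
        image.append(row)
    return image


# Hypothetical test tone (an assumption, not data from the paper):
# a 1 kHz sine sampled at 8 kHz for 512 samples.
SAMPLE_RATE = 8000
TONE_HZ = 1000
tone = [math.sin(2 * math.pi * TONE_HZ * n / SAMPLE_RATE) for n in range(512)]

image = spectrogram(tone)
# Index of the strongest frequency bin in the first time frame.
peak_bin = max(range(len(image[0])), key=lambda k: image[0][k])
print(len(image), len(image[0]), peak_bin)
```

With these assumed settings each frequency bin spans 8000 / 64 = 125 Hz, so the 1 kHz tone should peak in bin 8. In practice a library routine such as an FFT would replace the naive DFT loop, and the resulting image would be passed to a convolutional model.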

Bibliographic details

Source
International Computer Engineering Conference | 2018 | vii, 265 p. | 4 pages
Authors
Frank Raffaeli; Selim Awad
Original format: PDF
Chinese Library Classification: computing technology, computer technology
Keywords
Training; Synthesizers; Artificial intelligence; Neural networks; Speech synthesis; Convergence; Spectrogram

Similar literature

1. Using speaker adaptive training to realize Mandarin-Tibetan cross-lingual speech synthesis [J]. Yang Hongwu, Oura Keiichiro, Wang Haiyan. Multimedia Tools and Applications. 2015, No. 22
2. Average-Voice-Based Speech Synthesis Using HSMM-Based Speaker Adaptation and Adaptive Training [J]. Junichi Yamagishi, Takao Kobayashi. IEICE Transactions on Information and Systems. 2007, No. 2
3. Speech-Input Speech-Output Communication for Dysarthric Speakers Using HMM-Based Speech Recognition and Adaptive Synthesis System [J]. Dhanalakshmi M., Celin T. A. Mariya, Nagarajan T. Circuits, Systems, and Signal Processing. 2018, No. 2
4. Adaptive Speech Synthesis and training using low-cost AI Computing [C]. Frank Raffaeli, Selim Awad. International Computer Engineering Conference. 2018
5. Diphone Speech Synthesis Based on a Pitch-Adaptive Short-Time Fourier Transform [D]. Glinski, Stephen Charles. 1981
6. Improving Robustness of Deep Neural Network Acoustic Models via Speech Separation and Joint Adaptive Training [O]. Arun Narayanan, DeLiang Wang
7. HMM-Based Distributed Text-to-Speech Synthesis Incorporating Speaker-Adaptive Training [O]. Kwang Myung Jeon, Seung Ho Choi. 2015
