Adaptive Speech Synthesis and training using low-cost AI Computing

Abstract

Computer synthesis of human speech has long been possible. It is inexpensive to build the infrastructure and computing to apply subtle inflections, vocal character, phrasing, and modulation; however, these additions come at a price: not the expense of the hardware, but the large number of constants to be set and time-varying parameters that must be dynamically controlled. Although these parameters can be set manually, doing so is not time-efficient, and having one or even dozens of dynamic descriptors is not enough to emulate realistic speech. When the vocal representation is converted to a 2-D image, an adaptation of the spectrogram, it can be easily analyzed using a CNN (convolutional neural network). Existing low-cost tools for CNN analysis can now be applied. This structure, coupled with a fully-connected neural network, can serve as a human-interactive interface for real-time training of the AI synthesizer, emulating phrasing and even accents of the trainer. This paper details the structure of the synthesizer and the training AI, and presents some practical, functional examples exploring this technique using Keras, LabVIEW and MATLAB.
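The abstract does not specify how the vocal representation is adapted into a 2-D image, so the following is only an illustrative sketch of the general idea, assuming a plain short-time Fourier transform with a Hann window. The function name `spectrogram_image`, the frame length, the hop size, and the test tone are all hypothetical choices and are not taken from the paper.

```python
import numpy as np

def spectrogram_image(signal, frame_len=256, hop=128):
    """Return a (freq_bins, frames) array of log-magnitudes scaled to [0, 1].

    Assumed parameters: frame_len and hop are illustrative defaults,
    not values from the paper.
    """
    signal = np.asarray(signal, dtype=np.float64)
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    # Slice the waveform into overlapping, windowed frames.
    frames = np.stack([
        signal[i * hop : i * hop + frame_len] * window
        for i in range(n_frames)
    ])
    # Magnitude spectrum of each frame: shape (n_frames, frame_len // 2 + 1).
    mag = np.abs(np.fft.rfft(frames, axis=1))
    # Log compression, then min-max scaling so the result resembles a grayscale image.
    log_mag = np.log1p(mag)
    span = log_mag.max() - log_mag.min()
    if span > 0:
        image = (log_mag - log_mag.min()) / span
    else:
        image = np.zeros_like(log_mag)
    # Transpose so that rows are frequency bins and columns are time frames.
    return image.T

# Hypothetical input: one second of a 1 kHz tone sampled at 8 kHz.
sample_rate = 8000
t = np.arange(sample_rate) / sample_rate
tone = np.sin(2 * np.pi * 1000 * t)
image = spectrogram_image(tone)
```

An array of this kind, with one extra channel axis added, has the shape that image-oriented CNN layers typically expect, which is what lets general-purpose CNN tooling be reused for speech analysis.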

Publication details

Source
International Computer Engineering Conference, 2018, pp. 32-35 (4 pages)
Authors
Frank Raffaeli; Selim Awad
Original format: PDF
Keywords
Training; Synthesizers; Artificial intelligence; Neural networks; Speech synthesis; Convergence; Spectrogram

Similar literature

1. Using speaker adaptive training to realize Mandarin-Tibetan cross-lingual speech synthesis [J]. Yang Hongwu, Oura Keiichiro, Wang Haiyan. Multimedia Tools and Applications, 2015, No. 22.
2. Average-Voice-Based Speech Synthesis Using HSMM-Based Speaker Adaptation and Adaptive Training [J]. Junichi Yamagishi, Takao Kobayashi. IEICE Transactions on Information and Systems, 2007, No. 2.
3. Speech-Input Speech-Output Communication for Dysarthric Speakers Using HMM-Based Speech Recognition and Adaptive Synthesis System [J]. Dhanalakshmi M., Celin T. A. Mariya, Nagarajan T. Circuits, Systems, and Signal Processing, 2018, No. 2.
4. Adaptive Speech Synthesis and training using low-cost AI Computing [C]. Frank Raffaeli, Selim Awad. International Computer Engineering Conference, 2018.
5. Diphone Speech Synthesis Based on a Pitch-Adaptive Short-Time Fourier Transform [D]. Glinski, Stephen Charles. 1981.
6. Improving Robustness of Deep Neural Network Acoustic Models via Speech Separation and Joint Adaptive Training [O]. Arun Narayanan, DeLiang Wang.
7. HMM-Based Distributed Text-to-Speech Synthesis Incorporating Speaker-Adaptive Training [O]. Kwang Myung Jeon, Seung Ho Choi. 2015.
