Speech Enhancement for a Noise-Robust Text-to-Speech Synthesis System using Deep Recurrent Neural Networks



Abstract

Text-to-speech voices built from noisy recordings suffer in quality. To improve them, we propose using a recurrent neural network to enhance the acoustic parameters prior to training. We trained a deep recurrent neural network on a parallel database of noisy and clean acoustic parameters, used as the input and output of the network respectively. The database covered multiple speakers and diverse noise conditions. We also investigated using text-derived features as an additional input to the network. We processed a noisy database of two other speakers with this network and used its output to train an HMM-based text-to-speech acoustic model for each voice. Listening test results showed that the voice built with enhanced parameters was ranked significantly higher than the voices trained with noisy speech and with speech enhanced by a conventional enhancement system. The text-derived features improved results only for the female voice, which was then ranked as highly as a voice trained with clean speech.
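The input/output arrangement described in the abstract can be sketched as follows. This is only an illustration under stated assumptions, not the authors' implementation: the abstract does not specify the network architecture, so a single-layer Elman RNN with random, untrained weights stands in for the deep recurrent network, and only the forward pass and the regression loss are shown. All names (`make_inputs`, `TinyRNN`, `mse`), the feature dimensions, and the toy values are hypothetical.

```python
import math
import random


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def make_inputs(noisy_frames, text_frames=None):
    """Build network inputs: noisy acoustic frames, optionally with
    frame-aligned text-derived features appended to each frame."""
    if text_frames is None:
        return [list(f) for f in noisy_frames]
    if len(text_frames) != len(noisy_frames):
        raise ValueError("text and acoustic features must be frame-aligned")
    return [list(a) + list(t) for a, t in zip(noisy_frames, text_frames)]


class TinyRNN:
    """Single-layer Elman RNN with a linear output layer (forward pass only).
    Assumption: a stand-in for the deep recurrent network in the abstract."""

    def __init__(self, n_in, n_hidden, n_out, seed=0):
        rng = random.Random(seed)

        def mat(rows, cols):
            return [[rng.uniform(-0.1, 0.1) for _ in range(cols)]
                    for _ in range(rows)]

        self.n_hidden = n_hidden
        self.w_xh = mat(n_hidden, n_in)      # input -> hidden
        self.w_hh = mat(n_hidden, n_hidden)  # hidden -> hidden (recurrence)
        self.w_hy = mat(n_out, n_hidden)     # hidden -> enhanced parameters

    def forward(self, frames):
        """Map a sequence of input frames to a sequence of output frames."""
        h = [0.0] * self.n_hidden
        outputs = []
        for x in frames:
            h = [math.tanh(dot(wx, x) + dot(wh, h))
                 for wx, wh in zip(self.w_xh, self.w_hh)]
            outputs.append([dot(wy, h) for wy in self.w_hy])
        return outputs


def mse(pred, target):
    """Mean squared error between predicted and clean parameter sequences."""
    total = sum((p - t) ** 2
                for pf, tf in zip(pred, target)
                for p, t in zip(pf, tf))
    return total / sum(len(tf) for tf in target)


# Toy parallel data: 3 frames of 3 acoustic parameters (hypothetical values).
noisy = [[0.9, 0.1, 0.3], [0.7, 0.2, 0.4], [0.8, 0.0, 0.5]]
clean = [[1.0, 0.0, 0.3], [0.8, 0.1, 0.4], [0.9, 0.0, 0.5]]
# Hypothetical 2-dimensional text-derived features, one vector per frame.
text = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]

inputs = make_inputs(noisy, text)            # 3 + 2 = 5 values per frame
net = TinyRNN(n_in=5, n_hidden=4, n_out=3)
enhanced = net.forward(inputs)               # same shape as the clean target
loss = mse(enhanced, clean)                  # quantity a trainer would minimise
```

In a real system the weights would be learned by minimising this loss over the parallel noisy/clean database, and the enhanced parameters of the held-out speakers would then serve as training data for the synthesis model. Calling `make_inputs(noisy)` without the second argument corresponds to the configuration without text-derived features.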


Bibliographic record

Source: Annual Conference of the International Speech Communication Association | 2016 | 744p | 5 pages
Authors: Cassia Valentini-Botinhao; Xin Wang; Shinji Takaki; Junichi Yamagishi
Original format: PDF
Chinese Library Classification: TB95-53

Similar literature


1. A Spectral Masking Approach to Noise-Robust Speech Recognition Using Deep Neural Networks [J]. Li B., Sim K.C. Audio, Speech, and Language Processing, IEEE Transactions on. 2014, No. 8

2. Prosody modeling for syllable based text-to-speech synthesis using feedforward neural networks [J]. Reddy V. Ramu, Rao K. Sreenivasa. Neurocomputing. 2016, No. JANa1

3. Two-stage intonation modeling using feedforward neural networks for syllable based text-to-speech synthesis [J]. V. Ramu Reddy, K. Sreenivasa Rao. Computer Speech and Language. 2013, No. 5

4. Speech Enhancement for a Noise-Robust Text-to-Speech Synthesis System using Deep Recurrent Neural Networks [C]. Cassia Valentini-Botinhao, Xin Wang, Shinji Takaki, Annual Conference of the International Speech Communication Association. 2016

5. Engineering Recurrent Neural Networks for Low-Rank and Noise-Robust Computation [D]. Stock, Christopher Hopkins. 2021

6. Evaluation of Mixed Deep Neural Networks for Reverberant Speech Enhancement [O]. Michelle Gutiérrez-Muñoz, Astryd González-Salazar, Marvin Coto-Jiménez. 2020

7. Speech Enhancement for a Noise-Robust Text-to-Speech Synthesis System using Deep Recurrent Neural Networks [O]. Valentini Botinhao, Cassia, Wang, Xin, Takaki, Shinji, 2016

8. Text-To-Speech Phrasing Enhancement System Using Neural Networks [R]. Julig, L. F. 1995
