DBN-based Spectral Feature Representation for Statistical Parametric Speech Synthesis

Ya-Jun Hu; Zhen-Hua Ling


Abstract

This letter presents a method of deriving spectral features using a deep belief network (DBN) for hidden Markov model (HMM)-based parametric speech synthesis. At training time, a DBN is estimated to represent the high-dimensional spectral envelopes and then transforms them into binary codes. These DBN-based binary codes (DBCs) are used as spectral features for HMM modeling. At synthesis time, spectral envelopes are recovered from the predicted DBC sequences and then used for waveform reconstruction. Experimental results show that our proposed method can achieve better naturalness than the conventional method using mel-cepstra as spectral features and considering global variance (GV) during parameter generation.
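The encode and decode steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the class `TinyRBMLayer`, the layer sizes, the random untrained weights, the 0.5 threshold used for binarization, and the sigmoid visible units (which assume envelopes normalized to [0, 1]) are all assumptions made only for illustration. A real system would train the stack on spectral envelopes and model the predicted codes with HMMs before decoding.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


class TinyRBMLayer:
    """One layer with tied weights, standing in for a trained RBM (hypothetical)."""

    def __init__(self, n_visible, n_hidden, rng):
        # Random weights for illustration only; a real system would train them.
        self.w = [[rng.gauss(0.0, 0.1) for _ in range(n_hidden)]
                  for _ in range(n_visible)]
        self.b_hidden = [0.0] * n_hidden
        self.b_visible = [0.0] * n_visible

    def up(self, v):
        # Activation probability of each hidden unit given the visible vector.
        return [sigmoid(self.b_hidden[j]
                        + sum(v[i] * self.w[i][j] for i in range(len(v))))
                for j in range(len(self.b_hidden))]

    def down(self, h):
        # Mean of each visible unit given the hidden vector (transposed weights).
        return [sigmoid(self.b_visible[i]
                        + sum(h[j] * self.w[i][j] for j in range(len(h))))
                for i in range(len(self.b_visible))]


def encode(layers, envelope):
    """Bottom-up pass through the stack, then threshold to a binary code."""
    x = envelope
    for layer in layers:
        x = layer.up(x)
    return [1 if p > 0.5 else 0 for p in x]


def decode(layers, code):
    """Top-down pass that maps a binary code back to an envelope estimate."""
    x = [float(bit) for bit in code]
    for layer in reversed(layers):
        x = layer.down(x)
    return x


rng = random.Random(0)
layers = [TinyRBMLayer(16, 8, rng), TinyRBMLayer(8, 4, rng)]  # assumed sizes
envelope = [rng.random() for _ in range(16)]  # stand-in for one spectral frame
code = encode(layers, envelope)
recovered = decode(layers, code)
```

With untrained weights the recovered envelope carries no meaningful detail; the sketch only shows the data flow from a high-dimensional frame to a short binary code and back.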


Bibliographic details

Source
IEEE Signal Processing Letters | 2016, no. 3 | pp. 321-325 | 5 pages
Authors
Ya-Jun Hu; Zhen-Hua Ling
Affiliations

National Engineering Laboratory for Speech and Language Information Processing, University of Science and Technology of China, Hefei, China;

Original format: PDF
Text language: English
Keywords
Hidden Markov models; Speech synthesis; Feature extraction; Binary codes; Training; Speech; Acoustics;


Similar literature

1. On the impact of excitation and spectral parameters for expressive statistical parametric speech synthesis [J]. Ranniery Maia, Masami Akamine. Computer Speech and Language. 2014, no. 5
2. Modeling Spectral Envelopes Using Restricted Boltzmann Machines and Deep Belief Networks for Statistical Parametric Speech Synthesis [J]. Ling, Z.-H., Deng, Audio, Speech, and Language Processing, IEEE Transactions on. 2013, no. 10
3. Excitation modelling using epoch features for statistical parametric speech synthesis [J]. M Kiran Reddy, K Sreenivasa Rao. Computer Speech and Language. 2020, issue Mara
4. Multi-stream spectral representation for statistical parametric speech synthesis [C]. Kayoko Yanagisawa, Ranniery Maia, Yannis Stylianou. IEEE International Conference on Acoustics, Speech and Signal Processing. 2016
5. Statistical Parametric Speech Synthesis using Deep Learning Architectures [D]. Kang, Shiyin. 2016
6. Discriminative Multi-Stream Postfilters Based on Deep Learning for Enhancing Statistical Parametric Speech Synthesis [O]. Marvin Coto-Jiménez. 2021
7. A deep auto-encoder based low-dimensional feature extraction from FFT spectral envelopes for statistical parametric speech synthesis [O]. Takaki, Shinji, Yamagishi, Junichi. 2016
