Multi-stream spectral representation for statistical parametric speech synthesis

Abstract

In statistical parametric speech synthesis such as Hidden Markov Model (HMM) based synthesis, one of the problems is in the over-smoothing of parameters, which leads to a muffled sensation in the synthesised output. In this paper, we propose an approach in which the high frequency spectrum is modelled separately from the low frequency spectrum. The high frequency band, which does not carry much linguistic information, is clustered using a very large decision tree so as to generate parameters as close as possible to natural speech samples. The boundary frequency can be adjusted at synthesis time for each state. Subjective listening tests show that the proposed approach is significantly preferred over the conventional approach using a single spectrum stream. Samples synthesised using the proposed approach sound less muffled and more natural.
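
The abstract does not say how the two spectral streams are recombined, so the following is only a minimal sketch under stated assumptions: both streams are taken to be spectral envelopes sampled on the same linear frequency grid from 0 Hz to Nyquist, and the merge is a hard switch at the boundary. The names `boundary_bin`, `merge_streams` and `merge_states` are hypothetical and do not come from the paper.

```python
from typing import List, Sequence


def boundary_bin(boundary_hz: float, sample_rate: int, n_bins: int) -> int:
    """Map a boundary frequency (Hz) to a bin index on a linear 0..Nyquist grid."""
    nyquist = sample_rate / 2.0
    clamped = min(max(boundary_hz, 0.0), nyquist)  # keep the boundary inside the grid
    return int(round(clamped / nyquist * (n_bins - 1)))


def merge_streams(low: Sequence[float], high: Sequence[float],
                  boundary_hz: float, sample_rate: int) -> List[float]:
    """Assumed hard switch: low-stream bins below the boundary, high-stream bins from it upward."""
    if len(low) != len(high):
        raise ValueError("both streams must cover the same frequency grid")
    k = boundary_bin(boundary_hz, sample_rate, len(low))
    return list(low[:k]) + list(high[k:])


def merge_states(low_states: Sequence[Sequence[float]],
                 high_states: Sequence[Sequence[float]],
                 boundaries_hz: Sequence[float],
                 sample_rate: int) -> List[List[float]]:
    """Apply a separate boundary frequency to each state, as the abstract describes."""
    return [merge_streams(lo, hi, b, sample_rate)
            for lo, hi, b in zip(low_states, high_states, boundaries_hz)]


# Toy data: two states, nine bins each (0, 1000, ..., 8000 Hz at a 16 kHz sample rate).
low = [[1.0] * 9, [1.0] * 9]
high = [[2.0] * 9, [2.0] * 9]
merged = merge_states(low, high, [4000.0, 6000.0], sample_rate=16000)
```

With these toy values the first state switches streams at 4 kHz and the second at 6 kHz. A real system would more likely cross-fade around the boundary than switch abruptly; the hard switch is used here only to keep the sketch short.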

Bibliographic details

Source
IEEE International Conference on Acoustics, Speech and Signal Processing | 2016 | pp. 5160-5164 | 5 pages
Authors
Kayoko Yanagisawa; Ranniery Maia; Yannis Stylianou
Original format: PDF
Keywords
HMM-based speech synthesis; factorised speech representation; over-smoothing; sub-band

Similar literature

1. DBN-based Spectral Feature Representation for Statistical Parametric Speech Synthesis [J]. Ya-Jun Hu, Zhen-Hua Ling. IEEE Signal Processing Letters, 2016, no. 3.
2. Discriminative Multi-Stream Postfilters Based on Deep Learning for Enhancing Statistical Parametric Speech Synthesis [J]. Marvin Coto-Jiménez. Biomimetics, 2021, no. 12.
3. On the impact of excitation and spectral parameters for expressive statistical parametric speech synthesis [J]. Ranniery Maia, Masami Akamine. Computer Speech and Language, 2014, no. 5.
4. Multi-stream spectral representation for statistical parametric speech synthesis [C]. Kayoko Yanagisawa, Ranniery Maia, Yannis Stylianou. IEEE International Conference on Acoustics, Speech and Signal Processing, 2016.
5. Statistical Parametric Speech Synthesis using Deep Learning Architectures [D]. Shiyin Kang. 2016.
6. Discriminative Multi-Stream Postfilters Based on Deep Learning for Enhancing Statistical Parametric Speech Synthesis [O]. Marvin Coto-Jiménez. 2021.
