EURASIP Journal on Audio, Speech, and Music Processing

Speaker-dependent model interpolation for statistical emotional speech synthesis

Abstract

In this article, we propose a speaker-dependent model interpolation method for statistical emotional speech synthesis. The basic idea is to combine the neutral model set of the target speaker with an emotional model set selected from a pool of speakers. For model selection and interpolation-weight determination, we propose a novel monophone-based Mahalanobis distance, a proper distance measure between two hidden Markov model (HMM) sets. We design a Latin-square evaluation to reduce systematic bias in the subjective listening tests. The proposed interpolation method achieves good performance in emotional expressiveness, naturalness, and target-speaker similarity. Moreover, this performance is achieved without collecting any emotional speech from the target speaker, saving the cost of data collection and labeling.
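The abstract only sketches the method, so the following Python sketch shows one plausible reading of its two ingredients: a monophone-level Mahalanobis distance between two HMM sets (reduced here to per-state Gaussian means and covariances) and a linear interpolation of the corresponding means. The data layout (a dict mapping each monophone name to per-state (mean, covariance) pairs), the averaging of state-level distances, the mean-only interpolation, and all helper names are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def mahalanobis_gaussian(mu_a, mu_b, cov_b):
    # Mahalanobis distance from mean mu_a to the Gaussian (mu_b, cov_b).
    diff = mu_a - mu_b
    return float(np.sqrt(diff @ np.linalg.solve(cov_b, diff)))

def monophone_distance(set_a, set_b):
    # Average state-level Mahalanobis distance over the monophones the two
    # model sets share. Each set maps a monophone name to a list of
    # (mean, covariance) pairs, one pair per emitting HMM state.
    dists = [
        mahalanobis_gaussian(mu_a, mu_b, cov_b)
        for phone in set_a.keys() & set_b.keys()
        for (mu_a, _), (mu_b, cov_b) in zip(set_a[phone], set_b[phone])
    ]
    return float(np.mean(dists))

def select_emotional_set(neutral_target, emotional_pool):
    # Pick the pool speaker whose emotional model set lies closest to the
    # target speaker's neutral set under the monophone-based distance.
    return min(emotional_pool,
               key=lambda spk: monophone_distance(neutral_target,
                                                  emotional_pool[spk]))

def interpolate_means(neutral, emotional, w):
    # Linear interpolation of state means, w*neutral + (1-w)*emotional;
    # neutral covariances are kept unchanged for simplicity (assumption).
    return {
        phone: [
            (w * mu_n + (1.0 - w) * mu_e, cov_n)
            for (mu_n, cov_n), (mu_e, _) in zip(neutral[phone],
                                                emotional[phone])
        ]
        for phone in neutral.keys() & emotional.keys()
    }

# Toy usage: 2-dimensional features, a single monophone "a", one state.
rng = np.random.default_rng(0)
gauss = lambda: (rng.normal(size=2), np.eye(2))
neutral = {"a": [gauss()]}
pool = {"spk1": {"a": [gauss()]}, "spk2": {"a": [gauss()]}}
best = select_emotional_set(neutral, pool)
mixed = interpolate_means(neutral, pool[best], w=0.5)
```

In this sketch the interpolation weight w is a free parameter; in the paper it is determined from the same monophone-based distance, e.g. by giving more weight to whichever model set lies closer under that measure.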
