A Statistical Approach for Modeling Prosody Features using POS Tags for Emotional Speech Synthesis

Abstract

Deriving statistical models for emotional speech processing is a challenging problem because of the highly varying nature of emotion expressions. We address this problem by modeling prosodic parameter differences at the part-of-speech (POS) level for emotional utterances, for the purpose of emotional speech synthesis. Synthesis at the POS level is appealing because POS tags carry salient information conveying speech prominence. Analysis of energy, duration and F0 differences between matching neutral-angry, neutral-sad and neutral-happy emotional utterance pairs shows that Gaussian distributions can be used to model the parameter differences. Pairwise comparisons of POS features reveal that the normalized mean and median energy of sad POS tags are more likely to be larger than those of neutral, angry or happy POS tags. They also show that, for particular tags, angry emotion is more likely to have a higher F0 median than happy emotion, and sad emotion is more likely to have a higher F0 median than neutral emotion. Experiments on converting neutral speech into emotional speech using the Gaussian probability functions provide helpful insights into the application of statistical models in speech synthesis.
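The abstract gives no implementation details, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes one Gaussian per POS tag fitted to the emotional-minus-neutral difference of a single prosodic parameter. It then uses that Gaussian to shift a neutral value, and compares two fitted Gaussians pairwise. The toy F0 values, the tag names, and the functions `fit_difference_models`, `convert`, and `prob_greater` are all hypothetical.

```python
import math
import random
from statistics import NormalDist, mean, stdev

# Hypothetical toy data: (neutral, angry) F0 medians in Hz for words that
# carry the same POS tag in matched utterance pairs. Invented for illustration.
pairs = {
    "NN": [(180.0, 210.0), (175.0, 198.0), (190.0, 230.0), (185.0, 205.0)],
    "VB": [(170.0, 185.0), (165.0, 190.0), (172.0, 181.0), (168.0, 188.0)],
}


def fit_difference_models(pairs_by_tag):
    """Fit one Gaussian per POS tag to the (emotional - neutral) differences."""
    models = {}
    for tag, tag_pairs in pairs_by_tag.items():
        diffs = [emotional - neutral for neutral, emotional in tag_pairs]
        # Sample mean and sample standard deviation define the Gaussian.
        models[tag] = NormalDist(mean(diffs), stdev(diffs))
    return models


def convert(neutral_value, tag, models, rng):
    """Shift a neutral parameter by a difference drawn from the tag's Gaussian."""
    model = models[tag]
    return neutral_value + rng.gauss(model.mean, model.stdev)


def prob_greater(a, b):
    """P(X > Y) for independent X ~ a and Y ~ b, both Gaussian.

    X - Y is Gaussian with mean mu_a - mu_b and variance var_a + var_b.
    """
    diff = NormalDist(a.mean - b.mean, math.hypot(a.stdev, b.stdev))
    return 1.0 - diff.cdf(0.0)


models = fit_difference_models(pairs)
rng = random.Random(0)  # fixed seed so repeated runs draw the same shift

converted = convert(180.0, "NN", models, rng)
p_nn_over_vb = prob_greater(models["NN"], models["VB"])

print(f"NN difference model: mean={models['NN'].mean:.2f}, sd={models['NN'].stdev:.2f}")
print(f"converted F0 for a neutral 180 Hz noun: {converted:.1f} Hz")
print(f"P(NN shift > VB shift) = {p_nn_over_vb:.3f}")
```

Sampling the shift keeps some of the variability seen in the training pairs. Using `model.mean` directly instead would give a deterministic conversion. The independence assumption in `prob_greater` is a simplification made for this sketch.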

Bibliographic record

Source
2007, pp. 1237-1240 (4 pages)
Authors
M. Bulut; Sungbok Lee; S. Narayanan
Original format: PDF
Keywords
Gaussian distribution; speech synthesis; statistical analysis; F0 differences; Gaussian distributions; Gaussian probability functions; POS tags; emotional speech processing; emotional speech synthesis; emotional utterances; median energy; neutral speech; normalized m;

Similar literature

1. Corpus-based generation of prosodic features for emotional speech synthesis based on a generation process model and its evaluation [J]. Toshiya Katsura, Keikichi Hirose, Nobuaki Minematsu. 電子情報通信学会技術研究報告. 音声. Speech. 2002, No. 749
2. Emotional Speech Synthesis Based on Prosodic Feature Modification [J]. Ling He, Hua Huang, Margaret Lech. Engineering. 2013, No. 10
3. A Statistical Approach for Modeling Prosody Features using POS Tags for Emotional Speech Synthesis [C]. Bulut M., Sungbok Lee, Narayanan S.
4. Prosodic features of verbal irony in spontaneous speech [D]. Bryant, Gregory Alan. 2004
5. Statistical Approach to Incorporating Experimental Variability into a Mathematical Model of the Voltage-Gated Na+ Channel and Human Atrial Action Potential [O]. Daniel Gratz, Alexander J Winkle, Seth H Weinberg. 2021
6. PROSODY ANALYSIS AND MODELING FOR EMOTIONAL SPEECH SYNTHESIS [O]. 2014