Improved Prosody Generation by Maximizing Joint Probability of State and Longer Units

Yao Qian; Zhizheng Wu; Boyang Gao; Soong F.K.

Abstract

The current state-of-the-art hidden Markov model (HMM)-based text-to-speech (TTS) can produce highly intelligible, synthesized speech with decent segmental quality. However, its prosody, especially at phrase or sentence level, still tends to be bland. This blandness is partially due to the fact that the state-based HMM is inadequate in capturing global, hierarchical suprasegmental information in speech signals. In this paper, to improve the TTS prosody, longer units are first explicitly modeled with appropriate parametric distributions. The resultant models are then integrated with the state-based baseline models in generating better prosody by maximizing the joint probability. Experimental results in both Mandarin and English show consistent improvements over our baseline system with only a state-based prosody model. The improvements are both objectively measurable and subjectively perceivable.
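
The abstract does not spell out the formulation, so the following is only a minimal sketch of the general idea of combining a state-level model with a longer-unit model by maximizing their joint probability. It assumes, for illustration, that the state-level model is a per-frame Gaussian over an F0 contour with diagonal variance, that the longer unit (say, a syllable) is modeled by a Gaussian over the first few DCT coefficients of that contour, and that dynamic features are ignored. The function names, the `weight` parameter, and the toy numbers are hypothetical and are not taken from the paper.

```python
import numpy as np


def dct_matrix(num_coeffs, num_frames):
    """Return the first `num_coeffs` rows of the orthonormal DCT-II matrix."""
    n = np.arange(num_frames)
    k = np.arange(num_coeffs)[:, None]
    basis = np.sqrt(2.0 / num_frames) * np.cos(np.pi * (n + 0.5) * k / num_frames)
    basis[0] /= np.sqrt(2.0)  # scale the DC row so that the rows are orthonormal
    return basis


def generate_contour(state_mean, state_var, unit_mean, unit_var, weight=1.0):
    """Maximize the weighted joint log-probability of two Gaussian models.

    Assumed objective (illustrative, not the paper's exact criterion):
        log N(c; state_mean, diag(state_var))
        + weight * log N(D c; unit_mean, diag(unit_var))
    where D maps the contour c to its first DCT coefficients. The objective
    is quadratic in c, so the maximizer solves a linear system.
    """
    state_mean = np.asarray(state_mean, dtype=float)
    state_prec = 1.0 / np.asarray(state_var, dtype=float)
    unit_mean = np.asarray(unit_mean, dtype=float)
    unit_prec = 1.0 / np.asarray(unit_var, dtype=float)

    D = dct_matrix(len(unit_mean), len(state_mean))
    # Normal equations: (S^-1 + w D^T V^-1 D) c = S^-1 mu_s + w D^T V^-1 mu_d
    A = np.diag(state_prec) + weight * D.T @ (unit_prec[:, None] * D)
    b = state_prec * state_mean + weight * D.T @ (unit_prec * unit_mean)
    return np.linalg.solve(A, b)


if __name__ == "__main__":
    # Toy example: a flat state-level contour and a longer-unit model that
    # prefers a higher mean level (coefficient 0) and a falling shape
    # (positive coefficient 1).
    flat = np.full(8, 5.0)
    contour = generate_contour(flat, np.ones(8), [16.0, 1.0], [0.5, 0.5])
    print(np.round(contour, 3))
```

With `weight=0.0` the result falls back to the state-level means, and increasing `weight` pulls the contour toward the shape preferred by the longer-unit model, which is the trade-off the joint criterion is meant to capture in this simplified setting.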

Bibliographic details

Source
IEEE Transactions on Audio, Speech, and Language Processing | 2011, Issue 6 | pp. 1702-1710 | 9 pages
Authors
Yao Qian; Zhizheng Wu; Boyang Gao; Soong F.K.
Affiliation
Microsoft Res. Asia, Beijing, China
Original format
PDF
Language
English
Keywords
Discrete cosine transforms (DCTs); speech synthesis; statistical distributions

