A SUPERPOSED PROSODIC MODEL FOR CHINESE TEXT-TO-SPEECH SYNTHESIS


Abstract

This paper presents the application of the trainable SFC superpositional prosodic model to Chinese. Within the SFC model, prosodic parameters (F0, syllabic lengthening) are interpreted as the superposition of overlapping multi-parametric contours. These contours are associated with high-level prosodic features operating at different scopes, such as tones, stress, prosodic boundaries, and the part of speech of words. Each feature label corresponds to a metalinguistic function (morphological, lexical, syntactic, attitudinal, ...) that is represented by a neural network. The observed contour is the sum of the outputs of the corresponding neural networks. An analysis-by-synthesis scheme is implemented for automatic learning. The model works well in the concatenation of neighboring units. The RMSE of F0 prediction is 2.34 st (semitones, referenced to 200 Hz), and the correlation is 0.86. Perceptual experiments show that the predicted prosody is quite appropriate and fluent.
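The superposition and the reported error measures can be illustrated with a minimal sketch. It is not the authors' implementation: in the SFC model each contour is produced by a trained neural network, whereas here the contours are hand-written lists, and the function names (`hz_to_semitones`, `superpose`, `rmse`, `pearson`) and all numeric values are hypothetical. Only the 200 Hz reference and the use of RMSE and correlation come from the abstract.

```python
import math

REF_HZ = 200.0  # reference frequency stated in the abstract


def hz_to_semitones(f0_hz, ref_hz=REF_HZ):
    """Convert an F0 value in Hz to semitones relative to ref_hz."""
    return 12.0 * math.log2(f0_hz / ref_hz)


def superpose(n_syllables, contours):
    """Sum overlapping contours into one per-syllable contour.

    Each contour is a pair (start_index, values); values[i] is added at
    syllable start_index + i. The lists stand in for network outputs.
    """
    total = [0.0] * n_syllables
    for start, values in contours:
        for offset, value in enumerate(values):
            total[start + offset] += value
    return total


def rmse(pred, target):
    """Root-mean-square error between two equal-length sequences."""
    return math.sqrt(sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred))


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


# Hypothetical contours in semitones: one spanning all four syllables
# and a shorter one overlapping syllables 1 and 2.
predicted = superpose(4, [(0, [2.0, 1.0, 0.0, -1.0]), (1, [1.5, -0.5])])
# predicted == [2.0, 2.5, -0.5, -1.0]

# Hypothetical observed F0 values in Hz, converted to semitones re 200 Hz.
observed = [hz_to_semitones(f) for f in (220.0, 230.0, 195.0, 190.0)]

print("RMSE (st):", rmse(predicted, observed))
print("Correlation:", pearson(predicted, observed))
```

Working in semitones rather than Hz puts the error on a logarithmic scale, so the same RMSE value corresponds to the same pitch interval regardless of the speaker's register.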


Bibliographic record

Source
International Symposium on Chinese Spoken Language Processing; 15-18 December 2004; Hong Kong (CN) | 2004 | pp. 177-180 | 4 pages
Conference location: Hong Kong (CN)
Authors
Gao-Peng Chen; Gerard Bailly; Qing-Feng Liu; Ren-Hua Wang
Author affiliation

Iflytek Speech Lab, University of Science and Technology of China

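Conference organizer: (not listed)
Original format: PDF
Language of text: English
Chinese Library Classification: Programming languages, algorithmic languages
Keywords: (not listed)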

Similar literature


1. Cross-Dialect Adaptation Framework for Constructing Prosodic Models for Chinese Dialect Text-to-Speech Systems [J]. Chen-Yu Chiang. IEEE/ACM Transactions on Audio, Speech, and Language Processing, 2018, Issue 1.
2. Prosody modeling for syllable based text-to-speech synthesis using feedforward neural networks [J]. Reddy V. Ramu, Rao K. Sreenivasa. Neurocomputing, 2016, Issue JANa1.
3. Modeling stylized invariance and local variability of prosody in text-to-speech synthesis [J]. Chu M, Zhao Y, Chang E. Speech Communication, 2006, Issue 6.
4. A superposed prosodic model for Chinese text-to-speech synthesis [C]. Gao-Peng Chen, Bailly, G. 2004.
5. Building a prosodically sensitive diphone database for a Korean text-to-speech synthesis system [D]. Yoon, Kyuchul. 2005.
6. Spread and Impact of COVID-19 in China: A Systematic Review and Synthesis of Predictions From Transmission-Dynamic Models [O]. Yi-Fan Lin, Qibin Duan, Yiguo Zhou. 2020.
7. Modeling Prosody Patterns for Chinese Expressive Text-to-Speech Synthesis [O]. Zhiyong Wu, Lianhong Cai, Helen M. Meng. 2015.
