Journal: IEE Proceedings. Part K

Neural network-based F0 text-to-speech synthesiser for Mandarin

Abstract

A neural-network-based approach to synthesising F0 information for Mandarin text-to-speech is discussed. The basic idea is to use neural networks to model the relationship between linguistic features extracted from input text and parameters representing the pitch contour of syllables. Two multilayer perceptrons (MLPs) are used to separately synthesise the mean and shape of the pitch contour, using different linguistic features. A large set of utterances is employed to train these MLPs using the well-known back-propagation algorithm. Pronunciation rules for generating F0 information are automatically learned and implicitly memorised by the MLPs. In the synthesis, parameters representing the mean and shape of the pitch contour of each syllable are generated using linguistic features extracted from the given input text. Simulation results confirmed that this is a promising approach for F0 synthesis. The resulting synthesised pitch contours of syllables match well with their original counterparts. Average root mean square errors of 0.94 ms/frame and 1.00 ms/frame were achieved.
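
The abstract gives no implementation details, so the following is only a minimal sketch of the two-network idea and not the authors' system. It assumes a one-hidden-layer MLP, a mean-removed contour as the "shape" representation, four frames per syllable, pitch period measured in ms, and two made-up linguistic features (`tone`, `position`). The names `MLP`, `split_contour`, `toy_contour` and `average_rmse` are hypothetical, and the training data are synthetic.

```python
import math
import random


class MLP:
    """One-hidden-layer perceptron: tanh hidden units, linear outputs."""

    def __init__(self, n_in, n_hidden, n_out, rng):
        def init(rows, cols):
            return [[rng.uniform(-0.5, 0.5) for _ in range(cols)]
                    for _ in range(rows)]
        self.w1 = init(n_hidden, n_in + 1)    # extra column holds the bias
        self.w2 = init(n_out, n_hidden + 1)

    def forward(self, x):
        xb = list(x) + [1.0]
        h = [math.tanh(sum(w * v for w, v in zip(row, xb))) for row in self.w1]
        hb = h + [1.0]
        y = [sum(w * v for w, v in zip(row, hb)) for row in self.w2]
        return xb, hb, y

    def predict(self, x):
        return self.forward(x)[2]

    def train_step(self, x, target, lr):
        """One stochastic back-propagation update on squared error."""
        xb, hb, y = self.forward(x)
        d_out = [yi - ti for yi, ti in zip(y, target)]
        # Propagate the output error back through the tanh units
        # (the constant bias unit at the end of hb is skipped).
        d_hid = [(1.0 - hb[j] ** 2)
                 * sum(d_out[k] * self.w2[k][j] for k in range(len(d_out)))
                 for j in range(len(hb) - 1)]
        for k, dk in enumerate(d_out):
            for j, hj in enumerate(hb):
                self.w2[k][j] -= lr * dk * hj
        for j, dj in enumerate(d_hid):
            for i, xi in enumerate(xb):
                self.w1[j][i] -= lr * dj * xi


def split_contour(contour):
    """Split a contour into its mean and its mean-removed shape (assumed form)."""
    mean = sum(contour) / len(contour)
    return mean, [v - mean for v in contour]


def rmse(a, b):
    """Root mean square error between two equal-length contours."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)) / len(a))


def toy_contour(tone, position):
    """Synthetic stand-in for a measured contour: 4 frames, pitch period in ms."""
    level = 7.0 + 1.5 * tone - 0.5 * position
    slope = 0.6 * (tone - 0.5)
    return [level + slope * (t - 1.5) for t in range(4)]


rng = random.Random(0)
syllables = [(rng.random(), rng.random()) for _ in range(200)]
data = [(tone, pos, toy_contour(tone, pos)) for tone, pos in syllables]

# Two separate networks with different input features, as in the abstract.
mean_net = MLP(2, 4, 1, rng)    # inputs: tone and position (hypothetical)
shape_net = MLP(1, 4, 4, rng)   # input: tone only (hypothetical)


def average_rmse():
    """Average per-syllable RMSE between synthesised and original contours."""
    total = 0.0
    for tone, pos, contour in data:
        mean = mean_net.predict([tone, pos])[0]
        synth = [mean + s for s in shape_net.predict([tone])]
        total += rmse(synth, contour)
    return total / len(data)


rmse_before = average_rmse()
for epoch in range(60):
    for tone, pos, contour in data:
        mean, shape = split_contour(contour)
        mean_net.train_step([tone, pos], [mean], lr=0.01)
        shape_net.train_step([tone], shape, lr=0.01)
rmse_after = average_rmse()
print(f"average RMSE before: {rmse_before:.3f}, after: {rmse_after:.3f} (ms/frame)")
```

Synthesis in this sketch adds the predicted mean back onto the predicted shape, and the score is an RMSE averaged over syllables. That is one plausible reading of the "average root mean square errors" quoted in the abstract, not a confirmed definition.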