Published in: The Journal of the Acoustical Society of America

Emotional speech acoustic model for Malay: Iterative versus isolated unit training



Abstract

The ability of a speech synthesis system to synthesize emotional speech enhances the user's experience with such systems and their related applications. However, developing an emotional speech synthesis system is a daunting task given the complexity of human emotional speech. Recent state-of-the-art speech synthesis systems, such as those based on hidden Markov models, can synthesize emotional speech with acceptable naturalness when a good emotional speech acoustic model is available. However, building an emotional speech acoustic model requires adequate resources, including segment-phonetic labels of emotional speech, which is a problem for many under-resourced languages, including Malay. This research shows how an emotional speech acoustic model for Malay can be built with minimal resources. To achieve this objective, two initialization methods were considered: iterative training using the deterministic annealing expectation maximization algorithm, and isolated unit training. The seed model for the automatic segmentation is a neutral speech acoustic model, which was transformed to the target emotion using two transformation techniques: model adaptation and context-dependent boundary refinement. Two forms of evaluation were performed: an objective evaluation measuring the prosody error and a listening evaluation measuring the naturalness of the synthesized emotional speech.
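The abstract names the deterministic annealing expectation maximization (DAEM) algorithm without describing it. As a rough illustration of the general idea only, the sketch below applies DAEM to a toy one-dimensional Gaussian mixture rather than to the hidden Markov models used in the paper. The posteriors in the E-step are raised to a power beta, which is increased step by step until it reaches 1, where the update becomes ordinary EM. The function names, the beta schedule, the fixed shared variance, and the sample data are all assumptions made for this example and are not taken from the paper.

```python
import math


def tempered_posteriors(x, means, weights, var, beta):
    """E-step for one sample: posteriors proportional to (w_j * N(x | m_j, var)) ** beta."""
    log_joint = [
        beta * (math.log(max(w, 1e-300))
                - 0.5 * math.log(2.0 * math.pi * var)
                - 0.5 * (x - m) ** 2 / var)
        for m, w in zip(means, weights)
    ]
    peak = max(log_joint)  # subtract the maximum for numerical stability
    unnorm = [math.exp(v - peak) for v in log_joint]
    total = sum(unnorm)
    return [u / total for u in unnorm]


def daem_fit(data, means, var=1.0, betas=(0.1, 0.25, 0.5, 1.0), iters_per_beta=20):
    """Toy DAEM for a 1-D Gaussian mixture with a fixed, shared variance (an assumption).

    For each beta in the schedule, run a few EM iterations with tempered
    posteriors, then carry the estimates over to the next beta.
    """
    means = list(means)
    k = len(means)
    weights = [1.0 / k] * k
    for beta in betas:
        for _ in range(iters_per_beta):
            resp = [tempered_posteriors(x, means, weights, var, beta) for x in data]
            # M-step: re-estimate means and mixture weights from the soft counts
            for j in range(k):
                mass = sum(r[j] for r in resp)
                if mass > 0.0:
                    means[j] = sum(r[j] * x for r, x in zip(resp, data)) / mass
                weights[j] = mass / len(data)
    return means, weights


# Hypothetical data: two well-separated clusters near 0 and 5.
data = [-0.2, -0.1, 0.0, 0.1, 0.2, 4.8, 4.9, 5.0, 5.1, 5.2]
means, weights = daem_fit(data, means=[2.0, 3.0])
print(sorted(means), weights)
```

At small beta the posteriors are nearly uniform, so the two means are pulled toward the overall data mean. As beta grows they separate and settle on the two clusters, which is how annealing reduces the dependence on the starting values.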
