Towards Realizing Mandarin-Tibetan Bi-lingual Emotional Speech Synthesis with Mandarin Emotional Training Corpus

Peiwen Wu1; Hongwu Yang1; Zhenye Gan1

Abstract

This paper presents a hidden Markov model (HMM)-based method for Mandarin-Tibetan bi-lingual emotional speech synthesis that uses speaker adaptive training with a Mandarin emotional speech corpus. First, a one-speaker Tibetan neutral speech corpus, a multi-speaker Mandarin neutral speech corpus, and a multi-speaker Mandarin emotional speech corpus are used to train a set of mixed-language average acoustic models of the target emotion by speaker adaptive training. Then a one-speaker Mandarin neutral speech corpus or a one-speaker Tibetan neutral speech corpus is used to obtain a set of speaker-dependent acoustic models of the target emotion by applying the speaker adaptation transformation. Finally, Mandarin or Tibetan emotional speech is synthesized from the corresponding Mandarin or Tibetan speaker-dependent acoustic models of the target emotion. Subjective tests show that the average emotional mean opinion score is 4.14 for Tibetan and 4.26 for Mandarin, the average mean opinion score is 4.16 for Tibetan and 4.28 for Mandarin, and the average degradation opinion score is 4.28 for Tibetan and 4.24 for Mandarin. The proposed method can therefore synthesize both Tibetan and Mandarin speech with high naturalness and emotional expression using only a Mandarin emotional training speech corpus.
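
The subjective scores reported in the abstract are averages of listener ratings. As a rough illustration only, the sketch below shows how such an average is commonly computed, assuming each listener rates an utterance on a 1-to-5 scale. The function name and the ratings are hypothetical and are not taken from the paper; the paper's actual test protocol is not described in this record.

```python
from statistics import mean


def mean_opinion_score(ratings):
    """Return the arithmetic mean of listener ratings on a 1-5 scale.

    Hypothetical helper for illustration; not the authors' evaluation code.
    """
    if not ratings:
        raise ValueError("ratings must not be empty")
    if any(r < 1 or r > 5 for r in ratings):
        raise ValueError("each rating must lie between 1 and 5")
    return mean(ratings)


# Made-up ratings for one synthesized utterance (NOT data from the paper).
hypothetical_ratings = [4, 5, 4, 4, 5, 3, 4, 4]
print(mean_opinion_score(hypothetical_ratings))
```

Under this assumption, the same averaging would apply to all three reported measures, which would differ only in what listeners are asked to judge: emotional expression, overall naturalness, or degradation relative to a reference recording.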

Bibliographic record

Source
《国际计算机前沿大会会议论文集》 | 2017, No. 2 | pp. 29-32 | 4 pages
Authors
Peiwen Wu1; Hongwu Yang1; Zhenye Gan1
Affiliation

[1] College of Physics and Electronic Engineering, Northwest Normal University, Lanzhou 730070, China

Original format: PDF
Full-text language: CHI
Chinese Library Classification: Social science series, collected works, and serial publications
Keywords
Mandarin-Tibetan; cross-lingual; emotional; speech synthesis; hidden Markov model (HMM); speaker adaptive training; Mandarin-Tibetan cross-lingual speech synthesis; emotional speech synthesis
Added to database: 2023-07-26 01:31:35
