Journal: 電子情報通信学会技術研究報告. 応用音響. Engineering Acoustics

Do prosodic manual annotations matter for Japanese speech synthesis systems with WaveNet vocoder?

Abstract

We investigated the impact of noisy linguistic features on the performance of a Japanese neural-network-based speech synthesis system using a WaveNet vocoder. The investigation compared an ideal system, which used manually corrected linguistic features in both the training and test sets, against several other systems that used corrupted linguistic features. Both subjective and objective results demonstrate that corrupted linguistic features, especially those in the test set, degraded the system's performance to a statistically significant degree because of the mismatch between training and test conditions. Interestingly, while an utterance-level Turing test shows that listeners had difficulty distinguishing synthetic speech from natural speech, it further indicates that adding noise to the linguistic features in the training set can partially reduce the effect of the mismatch, regularize the model, and help the system perform better when the linguistic features of the test set are noisy.
