Source: 電子情報通信学会技術研究報告. 音声. Speech (IEICE Technical Report, Speech)

Analyzing the impact of including listener perception annotations in RNN-based emotional speech synthesis

Abstract

This paper investigates simultaneous modeling of multiple emotions in DNN-based expressive speech synthesis, and how to represent the emotional labels, such as emotional class and strength, for this task. Our goal is to answer two questions: First, what is the best way to annotate speech data with multiple emotions? Second, how should the emotional information be represented as labels for supervised DNN training? We evaluate on a large-scale corpus of emotional speech from a professional actress, additionally annotated with perceived emotional labels from crowd-sourced listeners. By comparing DNN-based speech synthesizers that utilize different emotional representations, we assess the impact of these representations and design decisions on human emotion recognition rates.
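
To make the question of label representation concrete, the sketch below shows three ways an emotional label could be encoded as a training target: a hard one-hot class, a soft distribution derived from listener votes, and a class vector scaled by emotional strength. It is only an illustration under stated assumptions. The emotion inventory, the function names, and the choice to multiply the class vector by strength are all hypothetical and are not taken from the paper.

```python
from collections import Counter

# Hypothetical emotion inventory; the abstract does not list the classes used.
EMOTIONS = ["neutral", "happy", "sad", "angry"]


def one_hot(emotion):
    """Hard label: 1.0 at the annotated emotion class, 0.0 elsewhere."""
    if emotion not in EMOTIONS:
        raise ValueError(f"unknown emotion: {emotion}")
    return [1.0 if e == emotion else 0.0 for e in EMOTIONS]


def perceived_distribution(votes):
    """Soft label: fraction of listeners who chose each emotion class."""
    if not votes:
        raise ValueError("at least one listener vote is required")
    counts = Counter(votes)
    return [counts[e] / len(votes) for e in EMOTIONS]


def with_strength(class_vector, strength):
    """Scale a class vector by an emotional strength in [0, 1] (assumed scheme)."""
    if not 0.0 <= strength <= 1.0:
        raise ValueError("strength must lie in [0, 1]")
    return [strength * v for v in class_vector]


# Intended label for an utterance, as an actor-side annotation might give it.
intended = one_hot("happy")

# Perceived label built from four hypothetical crowd-sourced listener votes.
perceived = perceived_distribution(["happy", "happy", "neutral", "happy"])

# Intended class combined with a medium strength value.
scaled = with_strength(intended, 0.5)
```

Any of these vectors could be appended to the linguistic input features of a synthesizer. Which representation yields the best emotion recognition rates is the question the paper evaluates, and this sketch does not answer it.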
