Conference: Spoken Language Technology Workshop

Domain Generalization with Triplet Network for Cross-Corpus Speech Emotion Recognition

Abstract

Domain generalization is a major challenge for cross-corpus speech emotion recognition. The recognition performance of systems built on "seen" source corpora inevitably degrades when the systems are tested against "unseen" target corpora that have different speakers, channels, and languages. We present a novel framework based on a triplet network to learn more generalized features of emotional speech that are invariant across multiple corpora. To reduce the intrinsic discrepancies between source and target corpora, an explicit feature transformation based on the triplet network is implemented as a preprocessing step. Extensive comparison experiments are carried out on three emotional speech corpora: two English corpora and one Japanese corpus. Remarkable improvements of up to 35.61% are achieved across all cross-corpus speech emotion recognition experiments, and we show that the proposed framework using the triplet network is effective for obtaining more generalized features across multiple emotional speech corpora.
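
The abstract does not say which loss, distance, or margin the triplet network uses, nor how triplets are sampled. As a rough illustration only, the sketch below shows the common margin-based triplet loss with Euclidean distance. The function names, the margin of 1.0, the toy 2-D embeddings, and the idea of pairing an anchor with a same-emotion utterance from a different corpus are all assumptions made for this example, not details taken from the paper.

```python
import math
from typing import Sequence


def euclidean(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance between two equal-length embedding vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def triplet_loss(
    anchor: Sequence[float],
    positive: Sequence[float],
    negative: Sequence[float],
    margin: float = 1.0,  # assumed value; the abstract gives none
) -> float:
    """Margin-based triplet loss: max(0, d(a, p) - d(a, n) + margin).

    The loss is zero once the negative is farther from the anchor
    than the positive by at least `margin`.
    """
    d_pos = euclidean(anchor, positive)
    d_neg = euclidean(anchor, negative)
    return max(0.0, d_pos - d_neg + margin)


# Hypothetical 2-D embeddings (real systems would use learned features).
anchor = [0.0, 0.0]        # e.g. an "angry" utterance from corpus A
same_emotion = [3.0, 4.0]  # e.g. an "angry" utterance from corpus B
other_emotion = [1.0, 0.0] # e.g. a "neutral" utterance from corpus A

# Same-emotion sample is far away (distance 5), other-emotion sample is
# close (distance 1): the constraint is violated and the loss is positive.
loss_violating = triplet_loss(anchor, same_emotion, other_emotion)

# Swap the roles: the positive is now close and the negative far,
# so the margin constraint holds and the loss is clipped to zero.
loss_satisfied = triplet_loss(anchor, other_emotion, same_emotion)

print(loss_violating, loss_satisfied)
```

Under this assumed formulation, minimizing the loss pulls same-emotion utterances together regardless of corpus and pushes different-emotion utterances apart, which is one plausible way to obtain the corpus-invariant features the abstract describes.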
