International Conference on Speech and Computer

Context Modeling for Cross-Corpus Dimensional Acoustic Emotion Recognition: Challenges and Mixup



Abstract

Recently, the focus of research in the field of affective computing has shifted to spontaneous interactions and time-continuous annotations. Such data expand the possibilities for real-world emotion recognition in the wild, but also introduce new challenges. Affective computing is a research area where data collection is neither a trivial nor a cheap task; it would therefore be rational to use all the data available. However, due to the subjective nature of emotions, differences in cultural and linguistic features, as well as environmental conditions, combining affective speech data is not a straightforward process. In this paper, we analyze the difficulties of automatic emotion recognition in a time-continuous, dimensional scenario using data from the RECOLA, SEMAINE and CreativeIT databases. We propose to employ a simple but effective strategy called "mixup" to overcome the gap in feature-target and target-target covariance structures across corpora. We showcase the performance of our system in three different cross-corpus experimental setups: single-corpus training, two-corpora training, and training on augmented (mixed-up) data. The findings show that the prediction behavior of trained models depends heavily on the covariance structure of the training corpus, and that mixup is very effective in improving the cross-corpus acoustic emotion recognition performance of context-dependent LSTM models.

