
Tracking Authentic and In-the-wild Emotions Using Speech


Abstract

This first-of-its-kind study aims to track authentic affect representations in-the-wild. We use the "Graz Real-life Affect in the Street and Supermarket (GRAS2)" corpus, which features audiovisual recordings of random participants in non-laboratory conditions. The participants were initially unaware of being recorded. This paradigm gave us a collection covering a wide range of authentic, spontaneous and natural affective behaviours. Six raters annotated twenty-eight conversations averaging 2.5 minutes in duration, tracking the arousal and valence levels of the participants. We generate the gold standards through a novel robust Evaluator Weighted Estimator (EWE) formulation. We train Support Vector Regressors (SVR) and Recurrent Neural Networks (RNN) on the low-level descriptors (LLDs) of the ComParE feature set in different derived representations, including bag-of-audio-words. Despite the challenging nature of this database, a fusion system achieved a highly promising concordance correlation coefficient (CCC) of .372 for the arousal dimension, while RNNs achieved a top CCC of .223 in predicting valence, using a bag-of-features representation.
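The abstract reports its results as concordance correlation coefficients. As a minimal sketch, assuming the standard definition by Lin (twice the covariance, divided by the sum of the two variances and the squared difference of the means), the metric could be computed as below. The function name `concordance_cc` and the toy traces are illustrative assumptions and are not taken from the paper.

```python
from statistics import fmean


def concordance_cc(pred, gold):
    """Lin's concordance correlation coefficient for two equal-length sequences.

    Hypothetical helper for illustration; not the authors' implementation.
    """
    if len(pred) != len(gold) or len(pred) < 2:
        raise ValueError("need two sequences of equal length >= 2")

    mean_p, mean_g = fmean(pred), fmean(gold)

    # Population (divide-by-n) variances and covariance, as in Lin's definition.
    var_p = fmean([(p - mean_p) ** 2 for p in pred])
    var_g = fmean([(g - mean_g) ** 2 for g in gold])
    cov = fmean([(p - mean_p) * (g - mean_g) for p, g in zip(pred, gold)])

    denom = var_p + var_g + (mean_p - mean_g) ** 2
    if denom == 0:
        raise ValueError("CCC is undefined when both sequences are the same constant")
    return 2 * cov / denom


if __name__ == "__main__":
    # Toy arousal traces (made-up numbers, only to exercise the function).
    gold = [0.1, 0.3, 0.2, 0.5, 0.4]
    pred = [0.2, 0.4, 0.3, 0.6, 0.5]  # same shape as gold, shifted up by 0.1
    print(round(concordance_cc(pred, gold), 3))
```

Unlike Pearson correlation, the CCC penalises a constant offset or scale mismatch between prediction and gold standard, so a perfectly correlated but shifted trace scores below 1.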
