Integrating Facial Images, Speeches and Time for Empathy Prediction

International Conference on Automatic Face and Gesture Recognition


Abstract

We propose a multi-modal method for the One-Minute Empathy Prediction competition. First, we use a bottleneck residual network and a fully-connected network to encode the listener's facial images and speech. Second, we propose using the current time stage as a temporal feature and encoding it into the proposed multi-modal network. Third, we select a subset of the training data based on its empathy-prediction performance on the validation data. Experimental results on the test set show that the proposed method significantly outperforms the baseline according to the CCC metric (0.14 vs. 0.06).
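The abstract reports results under the CCC (Concordance Correlation Coefficient) metric and describes fusing face, speech, and time-stage features in one network. The paper's code is not given here; the sketch below is illustrative only. The `ccc` function implements the standard Concordance Correlation Coefficient formula, while `MultiModalEmpathyNet` is a hypothetical PyTorch fusion module: the layer sizes, the concatenation-based fusion, and the one-hot time-stage encoding are all assumptions, not details taken from the paper.

```python
import numpy as np
import torch
import torch.nn as nn


def ccc(y_true: np.ndarray, y_pred: np.ndarray) -> float:
    """Concordance Correlation Coefficient (the competition's evaluation metric)."""
    mean_t, mean_p = y_true.mean(), y_pred.mean()
    var_t, var_p = y_true.var(), y_pred.var()
    cov = ((y_true - mean_t) * (y_pred - mean_p)).mean()
    return 2.0 * cov / (var_t + var_p + (mean_t - mean_p) ** 2)


class MultiModalEmpathyNet(nn.Module):
    """Hypothetical fusion network; all dimensions and layers are illustrative."""

    def __init__(self, face_dim: int = 512, audio_dim: int = 128, n_stages: int = 10):
        super().__init__()
        self.face_fc = nn.Linear(face_dim, 64)    # features from a bottleneck residual image encoder (assumed)
        self.audio_fc = nn.Linear(audio_dim, 64)  # features from a fully-connected speech encoder (assumed)
        self.time_fc = nn.Linear(n_stages, 16)    # one-hot encoding of the clip's current time stage (assumed)
        self.head = nn.Sequential(
            nn.ReLU(),
            nn.Linear(64 + 64 + 16, 1),
        )

    def forward(self, face, audio, time_stage):
        # Fuse the three modality embeddings by concatenation (an assumed fusion scheme).
        z = torch.cat(
            [self.face_fc(face), self.audio_fc(audio), self.time_fc(time_stage)],
            dim=-1,
        )
        return self.head(z).squeeze(-1)  # per-sample empathy score
```

With predictions from such a model, `ccc(labels, outputs)` yields the kind of score the abstract compares (0.14 vs. 0.06).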

