ACM International Conference on Multimodal Interaction

Combining Video, Audio and Lexical Indicators of Affect in Spontaneous Conversation via Particle Filtering



Abstract

We present experiments on fusing facial video, audio and lexical indicators for affect estimation during dyadic conversations. We use temporal statistics of texture descriptors extracted from facial video, a combination of various acoustic features, and lexical features to create regression-based affect estimators for each modality. The single-modality regressors are then combined using particle filtering, by treating these independent regression outputs as measurements of the affect states in a Bayesian filtering framework, where previous observations provide predictions about the current state by means of learned affect dynamics. Tested on the Audio-Visual Emotion Recognition Challenge dataset, our single-modality estimators achieve substantially higher scores than the official baseline method for every dimension of affect. Our filtering-based multi-modality fusion achieves correlation performances of 0.344 (baseline: 0.136) and 0.280 (baseline: 0.096) for the fully continuous and word-level sub-challenges, respectively.
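To make the fusion scheme concrete, below is a minimal Python sketch (not the authors' code) of the idea the abstract describes: per-modality regression outputs are treated as noisy measurements of a latent affect state, which a particle filter tracks using learned affect dynamics. The AR(1) dynamics model, the Gaussian observation likelihoods, and all parameter values are illustrative assumptions; the paper's actual dynamics and noise models are not given in the abstract.

```python
import numpy as np

rng = np.random.default_rng(0)

N_PARTICLES = 500
A, Q = 0.95, 0.05          # assumed AR(1) affect dynamics: x_t = A*x_{t-1} + noise(Q)
R = {"video": 0.30,        # assumed measurement-noise variance per modality
     "audio": 0.40,
     "lexical": 0.60}

def particle_filter(measurements):
    """Fuse per-frame modality measurements into one affect estimate per frame.

    measurements: list of dicts, one per frame, mapping a modality name to that
    modality's regression output (a scalar affect score) for the frame.
    Returns the posterior-mean affect trajectory.
    """
    particles = rng.normal(0.0, 1.0, N_PARTICLES)   # initial prior over affect
    estimates = []
    for z in measurements:
        # Prediction: propagate particles through the learned affect dynamics.
        particles = A * particles + rng.normal(0.0, np.sqrt(Q), N_PARTICLES)

        # Update: weight each particle by the joint likelihood of all modality
        # measurements under the assumed Gaussian observation model.
        log_w = np.zeros(N_PARTICLES)
        for mod, value in z.items():
            log_w += -0.5 * (value - particles) ** 2 / R[mod]
        w = np.exp(log_w - log_w.max())
        w /= w.sum()

        # Posterior-mean estimate for this frame, then resampling.
        estimates.append(float(np.dot(w, particles)))
        idx = rng.choice(N_PARTICLES, size=N_PARTICLES, p=w)
        particles = particles[idx]
    return estimates

# Toy usage: three regressors give noisy readings of a slowly drifting affect state.
frames = [{"video": 0.20, "audio": 0.30, "lexical": 0.10},
          {"video": 0.25, "audio": 0.20, "lexical": 0.30},
          {"video": 0.30, "audio": 0.35, "lexical": 0.20}]
print(particle_filter(frames))
```

Because each modality contributes an independent Gaussian likelihood term, modalities with lower assumed noise variance (here, video) pull the fused estimate more strongly, while the dynamics model smooths the trajectory across frames.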
