IEEE Transactions on Cybernetics

Affective State Level Recognition in Naturalistic Facial and Vocal Expressions



Abstract

Naturalistic affective expressions change at a rate much slower than the typical rate at which video or audio is recorded. This increases the probability that consecutive recorded instants of expressions represent the same affective content. In this paper, we exploit such a relationship to improve the recognition performance of continuous naturalistic affective expressions. Using datasets of naturalistic affective expressions (AVEC 2011 audio and video dataset, PAINFUL video dataset) continuously labeled over time and over different dimensions, we analyze the transitions between levels of those dimensions (e.g., transitions in pain intensity level). We use an information theory approach to show that the transitions occur very slowly and hence suggest modeling them as first-order Markov models. The dimension levels are considered to be the hidden states in the Hidden Markov Model (HMM) framework. Their discrete transition and emission matrices are trained by using the labels provided with the training set. The recognition problem is converted into a best path-finding problem to obtain the best hidden states sequence in HMMs. This is a key difference from previous use of HMMs as classifiers. Modeling of the transitions between dimension levels is integrated in a multistage approach, where the first level performs a mapping between the affective expression features and a soft decision value (e.g., an affective dimension level), and further classification stages are modeled as HMMs that refine that mapping by taking into account the temporal relationships between the output decision labels. The experimental results for each of the unimodal datasets show overall performance to be significantly above that of a standard classification system that does not take into account temporal relationships. In particular, the results on the AVEC 2011 audio dataset outperform all other systems presented at the international competition.
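The decoding step the abstract describes, finding the best hidden-state sequence through trained transition and emission matrices, is the classic Viterbi algorithm. Below is a minimal sketch in the log domain; the state space stands in for discrete affective dimension levels, and all matrix values in the usage example are illustrative, not taken from the paper:

```python
from math import log

def viterbi(obs, pi, A, B):
    """Most-likely hidden-state sequence for a discrete HMM (log domain).

    obs: list of observation indices (e.g. quantized soft-decision outputs)
    pi:  initial state probabilities, pi[i] = P(state i at t=0)
    A:   transition matrix, A[i][j] = P(next state j | current state i)
    B:   emission matrix, B[i][k] = P(observation k | state i)
    """
    S = len(pi)
    # Log-probability of the best path ending in each state at the first frame.
    logp = [log(pi[i]) + log(B[i][obs[0]]) for i in range(S)]
    back = []  # backpointers, one row per frame after the first
    for o in obs[1:]:
        prev = logp
        row, logp = [], []
        for j in range(S):
            # Best predecessor state for a path now entering state j.
            i_best = max(range(S), key=lambda i: prev[i] + log(A[i][j]))
            row.append(i_best)
            logp.append(prev[i_best] + log(A[i_best][j]) + log(B[j][o]))
        back.append(row)
    # Backtrack from the best final state to recover the full path.
    path = [max(range(S), key=lambda j: logp[j])]
    for row in reversed(back):
        path.append(row[path[-1]])
    return path[::-1]
```

A "sticky" transition matrix with large diagonal entries encodes the paper's key observation that affective levels change slowly relative to the frame rate: the decoder then smooths the frame-level soft decisions, resisting spurious single-frame level jumps.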
