Journal of Visual Communication & Image Representation

Video facial emotion recognition based on local enhanced motion history image and CNN-CTSLSTM networks



Abstract

This paper focuses on recognizing facial emotion expressions in video sequences and proposes an integrated framework of two networks: a local network and a global network, based on a local enhanced motion history image (LEMHI) and a CNN-LSTM cascaded network, respectively. In the local network, the frames of an input video are aggregated into a single frame by a novel method, LEMHI. This approach improves MHI by using detected facial landmarks as attention areas to boost local values in the difference-image calculation, so that the actions of crucial facial units are captured effectively. This single frame is then fed into a CNN for prediction. In the global network, an improved CNN-LSTM model serves as a global feature extractor and classifier for video facial emotion recognition. Finally, a random-search weighted summation strategy is applied as late fusion to produce the final prediction. Our work also offers insight into the networks by visualizing feature maps from each CNN layer to decipher which portions of the face influence the networks' predictions. Experiments on the AFEW, CK+ and MMI datasets under a subject-independent validation scheme demonstrate that the integrated framework achieves better performance than either network alone. Compared with state-of-the-art methods, the proposed framework demonstrates superior performance. (C) 2018 Published by Elsevier Inc.
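The LEMHI aggregation described in the abstract — classic MHI frame differencing, with the difference values boosted inside attention regions around detected facial landmarks — might be sketched as follows. This is a minimal illustration, not the authors' implementation; the boost factor, attention radius, decay step, and motion threshold are all assumptions:

```python
import numpy as np

def lemhi(frames, landmarks, boost=2.0, radius=10, thresh=15, tau=None):
    """Aggregate a grayscale frame sequence into a single LEMHI-style frame.

    frames:    list of (H, W) uint8 grayscale frames
    landmarks: per-frame lists of (x, y) facial-landmark coordinates
    boost:     multiplier applied to difference values near landmarks
               (the "local enhancement"; the exact value is an assumption)
    """
    tau = tau if tau is not None else len(frames) - 1
    h, w = frames[0].shape
    mhi = np.zeros((h, w), dtype=np.float32)
    ys, xs = np.mgrid[0:h, 0:w]
    for t in range(1, len(frames)):
        # Plain frame differencing, as in a classic MHI.
        diff = np.abs(frames[t].astype(np.int16)
                      - frames[t - 1].astype(np.int16)).astype(np.float32)
        # Local enhancement: boost differences inside attention discs
        # centred on the detected facial landmarks of this frame.
        attn = np.ones_like(diff)
        for (x, y) in landmarks[t]:
            attn[(xs - x) ** 2 + (ys - y) ** 2 <= radius ** 2] = boost
        diff *= attn
        # MHI update: stamp tau where motion occurs, decay elsewhere.
        mhi = np.where(diff > thresh, float(tau), np.maximum(mhi - 1.0, 0.0))
    # Normalise to an 8-bit image suitable as CNN input.
    return (255.0 * mhi / tau).astype(np.uint8)
```

The resulting single frame plays the role of the local network's input: recent motion near facial action units appears bright, older or weak motion fades out.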
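The late-fusion step — a weighted sum of the two networks' class probabilities, with the weight chosen by random search — could look like the sketch below. The weight range, trial count, and validation-accuracy objective are assumptions, not details from the paper:

```python
import numpy as np

def random_search_fusion(p_local, p_global, y_val, n_trials=200, seed=0):
    """Pick a fusion weight w in [0, 1] by random search, maximising
    validation accuracy of w * p_local + (1 - w) * p_global.

    p_local, p_global: (N, C) class-probability matrices from the two networks
    y_val:             (N,) ground-truth labels on the validation set
    """
    rng = np.random.default_rng(seed)
    best_w, best_acc = 0.5, -1.0
    for _ in range(n_trials):
        w = rng.uniform(0.0, 1.0)
        fused = w * p_local + (1.0 - w) * p_global
        acc = float((fused.argmax(axis=1) == y_val).mean())
        if acc > best_acc:
            best_w, best_acc = w, acc
    return best_w, best_acc
```

At test time the selected weight is frozen and the same weighted sum is applied to the two networks' outputs before taking the arg-max class.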

Bibliographic Record

  • Source
  • Author Affiliations

    Hefei Univ Technol Sch Comp & Informat Hefei Anhui Peoples R China|Anhui Prov Key Lab Affect Comp & Adv Intelligent Hefei 230009 Anhui Peoples R China;

    Hefei Univ Technol Sch Comp & Informat Hefei Anhui Peoples R China|Anhui Prov Key Lab Affect Comp & Adv Intelligent Hefei 230009 Anhui Peoples R China;

    Hefei Univ Technol Sch Comp & Informat Hefei Anhui Peoples R China|Anhui Prov Key Lab Affect Comp & Adv Intelligent Hefei 230009 Anhui Peoples R China;

    Hefei Univ Technol Sch Comp & Informat Hefei Anhui Peoples R China;

    Hefei Univ Technol Sch Comp & Informat Hefei Anhui Peoples R China;

  • Indexing Information
  • Format: PDF
  • Language: eng
  • CLC Classification
  • Keywords

    Video emotion recognition; Motion history image; LSTM; Facial landmarks;

