Visual Scene-aware Hybrid Neural Network Architecture for Video-based Facial Expression Recognition


Abstract

With the rapid development of deep learning, facial expression recognition (FER) technology has made considerable progress recently. However, because conventional FER techniques are mainly designed and trained for videos acquired artificially in a limited environment, they may not operate robustly on videos acquired in a wild environment. To solve this problem, this paper proposes a scene-aware hybrid neural network (NN) built on a novel combination of a three-dimensional (3D) convolutional NN (CNN), a 2D CNN, and a recurrent NN (RNN). The characteristics of the proposed network are as follows. First, we extract video-based global features and frame-based local features at the same time. In detail, latent features capturing the overall visual scene of a given video are extracted by a 3D CNN with an auxiliary classifier, and a fine-tuned 2D CNN extracts latent features capturing small details from each frame. Second, the RNN not only performs temporal-domain learning but also fuses, feature-wise, the two latent features extracted by these networks. For effective fusion, we also present three RNN schemes. Third, the proposed network, in which the above-mentioned methods collaborate, works robustly in a wild environment as well as in a limited environment. Extensive experiments show that the proposed network achieves an average accuracy of 49.9% on the AFEW dataset, a representative wild dataset, and an accuracy of 98.2% on the CK+ dataset. We also show that the proposed network outperforms the state-of-the-art network(s).
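The data flow described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the three RNN fusion schemes, so plain concatenation of the video-level vector onto each frame-level vector is an assumption made here for illustration. The CNN outputs are replaced by random stand-in vectors, and every name, dimension, and the class count of 7 is hypothetical.

```python
import math
import random


def fuse_features(global_feat, local_feats):
    """Append the single video-level vector to every frame-level vector.

    Concatenation is an assumed fusion rule, not one taken from the paper.
    """
    return [local + global_feat for local in local_feats]


def rnn_forward(inputs, w_in, w_rec, bias):
    """Run a plain Elman RNN over the sequence and return the last hidden state."""
    hidden = [0.0] * len(bias)
    for x in inputs:
        new_hidden = []
        for j in range(len(bias)):
            total = bias[j]
            total += sum(w_in[j][i] * x[i] for i in range(len(x)))
            total += sum(w_rec[j][k] * hidden[k] for k in range(len(hidden)))
            new_hidden.append(math.tanh(total))
        hidden = new_hidden
    return hidden


def softmax(scores):
    """Convert raw scores to probabilities (shifted by the max for stability)."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def random_matrix(rng, rows, cols):
    """Small random weights; a trained model would learn these instead."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)

# Hypothetical sizes chosen only to keep the example small.
NUM_FRAMES, LOCAL_DIM, GLOBAL_DIM, HIDDEN_DIM, NUM_CLASSES = 16, 8, 4, 6, 7

# Stand-ins for the 3D CNN output (one vector per video)
# and the 2D CNN output (one vector per frame).
global_feat = [rng.random() for _ in range(GLOBAL_DIM)]
local_feats = [[rng.random() for _ in range(LOCAL_DIM)] for _ in range(NUM_FRAMES)]

fused = fuse_features(global_feat, local_feats)

w_in = random_matrix(rng, HIDDEN_DIM, LOCAL_DIM + GLOBAL_DIM)
w_rec = random_matrix(rng, HIDDEN_DIM, HIDDEN_DIM)
bias = [0.0] * HIDDEN_DIM
w_out = random_matrix(rng, NUM_CLASSES, HIDDEN_DIM)

final_hidden = rnn_forward(fused, w_in, w_rec, bias)
logits = [sum(w * h for w, h in zip(row, final_hidden)) for row in w_out]
probs = softmax(logits)

print("fused sequence shape:", len(fused), "x", len(fused[0]))
print("class probabilities:", [round(p, 3) for p in probs])
```

The point of the sketch is the shape of the computation: one global vector per video, one local vector per frame, a fused per-frame sequence consumed by a recurrent layer, and a classifier on the final hidden state. The weights are random, so the printed probabilities carry no meaning.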


Bibliographic details

Source
International Conference on Automatic Face and Gesture Recognition | 2019 | 753p | 8 pages
Authors
Min Kyu Lee; Dong Yoon Choi; Dae Ha Kim; Byung Cheol Song;
Original format: PDF
Chinese Library Classification: Information processing
Keywords
convolutional neural nets; emotion recognition; face recognition; feature extraction; image classification; image fusion; image representation; learning (artificial intelligence); recurrent neural nets; stereo image processing; video signal processing;


Similar literature


1. Adaptive metric learning with deep neural networks for video-based facial expression recognition [J]. Liu Xiaofeng, Ge Yubin, Yang Chao. Journal of electronic imaging. 2018, Issue 1
2. Fusing HOG and convolutional neural network spatial–temporal features for video-based facial expression recognition [J]. Image Processing, IET. 2020, Issue 1
3. Cloud basis function neural network: A modified RBF network architecture for holistic facial expression recognition [J]. De Silva CR, Ranganath S, De Silva LC. Pattern Recognition: The Journal of the Pattern Recognition Society. 2008, Issue 4
4. Visual Scene-aware Hybrid Neural Network Architecture for Video-based Facial Expression Recognition [C]. Min Kyu Lee, Dong Yoon Choi, Dae Ha Kim. International Conference on Automatic Face and Gesture Recognition. 2019
5. Deep Convolutional Neural Network for Facial Expression Recognition Using Facial Parts [D]. Nwosu, Lucy. 2017
6. Visual Scene-Aware Hybrid and Multi-Modal Feature Aggregation for Facial Expression Recognition [O]. Min Kyu Lee, Dae Ha Kim, Byung Cheol Song. 2020
7. Recognition of facial action units from video streams with recurrent neural networks: a new paradigm for facial expression recognition [O]. Vadapalli Hima Bindu. 2011
