
Facial Expressions Recognition for Human–Robot Interaction Using Deep Convolutional Neural Networks with Rectified Adam Optimizer


Abstract

This paper presents the interaction between humans and an NAO robot using deep convolutional neural networks (CNNs), based on an innovative end-to-end pipeline that applies two optimized CNNs, one for face recognition (FR) and one for facial expression recognition (FER), in order to obtain real-time inference speed for the entire process. Two different models are considered for FR: one known to be very accurate but with low inference speed (a faster region-based convolutional neural network), and one that is less accurate but has high inference speed (a single shot detector convolutional neural network). For emotion recognition, transfer learning and fine-tuning of three CNN models (VGG, Inception V3 and ResNet) have been used. The overall results show that the single shot detector convolutional neural network (SSD CNN) and the faster region-based convolutional neural network (Faster R-CNN) models for face detection share almost the same accuracy: 97.8% for Faster R-CNN on the PASCAL visual object classes (PASCAL VOC) evaluation metrics and 97.42% for SSD Inception. In terms of FER, ResNet obtained the highest training accuracy (90.14%), while the visual geometry group (VGG) network reached 87% and Inception V3 reached 81%. The results show improvements of over 10% when using two serialized CNNs instead of only the FER CNN, while the recent optimization method called rectified adaptive moment optimization (RAdam) led to better generalization and an accuracy improvement of 3%-4% on each emotion recognition CNN.
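To illustrate the serialized two-CNN pipeline the abstract describes (a detector CNN localizes the face, and the crop is then fed to an emotion-classification CNN), the following is a minimal sketch only. It substitutes a COCO-trained torchvision Faster R-CNN as a stand-in for the paper's trained face detector, and the `emotion_model` head and `EMOTION_LABELS` list are hypothetical placeholders rather than the authors' models.

```python
# Sketch of a serialized detection -> crop -> expression-classification pipeline.
# The detector and classifier here are generic stand-ins, not the paper's models.
import torch
import torchvision
from torchvision.transforms import functional as TF
from PIL import Image

EMOTION_LABELS = ["anger", "disgust", "fear", "happiness",
                  "neutral", "sadness", "surprise"]  # assumed label set

# Stage 1: detector CNN (COCO-trained Faster R-CNN as a placeholder face detector).
detector = torchvision.models.detection.fasterrcnn_resnet50_fpn(weights="DEFAULT")
detector.eval()

# Stage 2: emotion classifier (ImageNet ResNet-50 with a new, untrained FER head).
emotion_model = torchvision.models.resnet50(weights="DEFAULT")
emotion_model.fc = torch.nn.Linear(emotion_model.fc.in_features, len(EMOTION_LABELS))
emotion_model.eval()  # in practice this head would be fine-tuned on a FER dataset


@torch.no_grad()
def recognize_expression(image: Image.Image) -> str:
    """Take the highest-scoring detection, crop it, and classify the expression."""
    tensor = TF.to_tensor(image)
    detections = detector([tensor])[0]           # boxes are sorted by score
    if len(detections["boxes"]) == 0:
        return "no face detected"
    x1, y1, x2, y2 = detections["boxes"][0].int().tolist()
    crop = TF.resized_crop(tensor, y1, x1, y2 - y1, x2 - x1, [224, 224])
    crop = TF.normalize(crop, mean=[0.485, 0.456, 0.406],
                        std=[0.229, 0.224, 0.225])
    logits = emotion_model(crop.unsqueeze(0))
    return EMOTION_LABELS[int(logits.argmax(dim=1))]
```

Serializing the two networks in this way is what lets the face crop, rather than the full frame, reach the FER model, which is the source of the reported accuracy gain over running the FER CNN alone.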
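The abstract also highlights transfer learning and fine-tuning with the Rectified Adam (RAdam) optimizer. The sketch below shows one plausible way to set that up for ResNet-50 with `torch.optim.RAdam`; the dataset path, epoch count, and hyper-parameters are assumptions for illustration, not the paper's training configuration.

```python
# Sketch of transfer learning / fine-tuning an ImageNet-pretrained ResNet-50
# for facial-expression recognition using the RAdam optimizer.
import torch
from torch import nn
from torch.utils.data import DataLoader
from torchvision import datasets, models, transforms

NUM_CLASSES = 7  # assumed number of facial-expression classes

preprocess = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])

# Hypothetical ImageFolder layout: one sub-directory per emotion class.
train_set = datasets.ImageFolder("data/fer/train", transform=preprocess)
train_loader = DataLoader(train_set, batch_size=32, shuffle=True)

model = models.resnet50(weights="DEFAULT")                # pretrained backbone
model.fc = nn.Linear(model.fc.in_features, NUM_CLASSES)   # new FER head

optimizer = torch.optim.RAdam(model.parameters(), lr=1e-4)  # Rectified Adam
criterion = nn.CrossEntropyLoss()

model.train()
for epoch in range(10):                                   # assumed epoch count
    for images, labels in train_loader:
        optimizer.zero_grad()
        loss = criterion(model(images), labels)
        loss.backward()
        optimizer.step()
    print(f"epoch {epoch}: loss {loss.item():.4f}")
```

RAdam rectifies the variance of the adaptive learning rate during the first optimization steps, which is the property the authors credit for the reported 3%-4% accuracy gain and better generalization on each emotion recognition CNN.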
