Conference: IEEE International Conference on Acoustics, Speech and Signal Processing

Nose, Eyes and Ears: Head Pose Estimation by Locating Facial Keypoints


Abstract

Monocular head pose estimation requires learning a model that computes the intrinsic Euler angles of the pose (yaw, pitch, roll) from an input image of a human face. Annotating ground-truth head pose angles for images in the wild is difficult and requires ad hoc fitting procedures, which provide only coarse, approximate annotations. This highlights the need for approaches that can train on data captured in a controlled environment and generalize to images in the wild, where the appearance and illumination of the face vary. Most present-day deep learning approaches, which learn a regression function directly on the input images, fail to do so. To this end, we propose regressing the head pose from a higher-level representation while still using deep learning architectures. More specifically, we use uncertainty maps in the form of 2D soft localization heatmap images over five facial keypoints, namely the left ear, right ear, left eye, right eye, and nose, and pass them through a convolutional neural network to regress the head pose. We show head pose estimation results on two challenging benchmarks, BIWI and AFLW, and our approach surpasses the state of the art on both datasets.
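To make the intermediate representation concrete, the sketch below builds one soft localization heatmap per keypoint and stacks the five maps into a multi-channel input of the kind a convolutional network could regress from. The abstract does not give the heatmap formula, so the isotropic Gaussian, the `sigma` value, the 64x64 resolution, the channel order, and the all-zero map for a keypoint that is not visible are all assumptions made for illustration; the function names are hypothetical and the regression network itself is not shown.

```python
import math

# Channel order is an assumption; the abstract only names the five keypoints.
KEYPOINTS = ("left_ear", "right_ear", "left_eye", "right_eye", "nose")


def gaussian_heatmap(cx, cy, width, height, sigma=2.0):
    """Return a height x width grid whose value is 1.0 at (cx, cy) and
    decays as an isotropic Gaussian (assumed form) away from it."""
    return [
        [math.exp(-((x - cx) ** 2 + (y - cy) ** 2) / (2.0 * sigma ** 2))
         for x in range(width)]
        for y in range(height)
    ]


def keypoint_heatmaps(locations, width, height, sigma=2.0):
    """Stack one heatmap per keypoint, in KEYPOINTS order.

    `locations` maps a keypoint name to (x, y) pixel coordinates, or to None
    when the keypoint is not visible. Using an all-zero map in that case is
    an assumption of this sketch.
    """
    maps = []
    for name in KEYPOINTS:
        loc = locations.get(name)
        if loc is None:
            maps.append([[0.0] * width for _ in range(height)])
        else:
            maps.append(gaussian_heatmap(loc[0], loc[1], width, height, sigma))
    return maps


def argmax_2d(grid):
    """Return the (x, y) position of the largest value in a 2D grid."""
    _, x, y = max((v, x, y) for y, row in enumerate(grid)
                  for x, v in enumerate(row))
    return x, y


# Hypothetical keypoint locations for a face turned to one side, so that
# one ear is hidden.
locations = {
    "left_ear": None,
    "right_ear": (50, 30),
    "left_eye": (20, 24),
    "right_eye": (36, 24),
    "nose": (26, 34),
}
stack = keypoint_heatmaps(locations, width=64, height=64)
nose_peak = argmax_2d(stack[KEYPOINTS.index("nose")])
```

Here `stack` is a 5 x 64 x 64 nested list. A network trained on such maps sees only where the keypoints are and how confident the localization is, not the face's texture or lighting, which is the property the abstract relies on for generalizing from controlled data to images in the wild.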
