Computer Vision and Image Understanding

Hierarchical temporal graphical model for head pose estimation and subsequent attribute classification in real-world videos



Abstract

Recently, head pose estimation in real-world environments has been receiving attention in the computer vision community due to its applicability to a wide range of contexts. However, this task remains an open problem because of the challenges presented by real-world environments. Most approaches to this problem focus on estimation from single images or video frames, without leveraging the temporal information available in the entire video sequence. Other approaches frame the problem as classification into a set of very coarse pose bins. In this paper, we propose a hierarchical graphical model that probabilistically estimates continuous head pose angles from real-world videos by leveraging the temporal pose information across frames. The proposed graphical model is a general framework that can use any type of feature and can be adapted to any facial classification task. Furthermore, the framework outputs the entire pose distribution for a given video frame. This permits robust temporal probabilistic fusion of pose information over the video sequence, and also allows the head pose information to be probabilistically embedded into other inference tasks. Experiments on large, real-world video sequences reveal that our approach significantly outperforms alternative state-of-the-art pose estimation methods. The proposed framework is also evaluated on gender and facial hair estimation. By incorporating pose information into the proposed hierarchical temporal graphical model, superior results are achieved for attribute classification tasks.
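The temporal probabilistic fusion the abstract describes can be illustrated with a simple Bayesian forward filter over a discretized pose-angle distribution. The sketch below is not the paper's actual model; it assumes a hypothetical Gaussian random-walk transition between consecutive frames and a per-frame likelihood over pose bins, purely to show how per-frame pose distributions can be fused over a video sequence.

```python
import numpy as np

def forward_filter(frame_likelihoods, pose_bins, sigma_deg=10.0):
    """Fuse per-frame pose likelihoods over time (illustrative sketch).

    frame_likelihoods: array of shape (T, B), unnormalized likelihood of
                       each pose bin at each frame.
    pose_bins:         array of shape (B,), pose angles in degrees.
    sigma_deg:         assumed frame-to-frame pose change (hypothetical).
    Returns an array of shape (T, B) of per-frame posterior distributions.
    """
    # Transition matrix: the head pose is assumed to drift smoothly,
    # so nearby angles are more likely successors than distant ones.
    diff = pose_bins[:, None] - pose_bins[None, :]
    trans = np.exp(-0.5 * (diff / sigma_deg) ** 2)
    trans /= trans.sum(axis=1, keepdims=True)  # rows sum to 1

    belief = np.full(len(pose_bins), 1.0 / len(pose_bins))  # uniform prior
    posteriors = []
    for lik in frame_likelihoods:
        predicted = trans.T @ belief   # propagate belief one frame forward
        belief = predicted * lik       # fuse the current frame's evidence
        belief /= belief.sum()         # renormalize to a distribution
        posteriors.append(belief.copy())
    return np.array(posteriors)
```

Because the filter carries the full distribution rather than a point estimate, a noisy frame only flattens the posterior instead of derailing the track, which mirrors the robustness argument in the abstract.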
