Journal of Vision

Pictorial Human Spaces: How Well do Humans Perceive a 3D Articulated Pose?

Abstract

When shown a photograph of a person, humans have a vivid, immediate sense of 3D pose awareness and a rapid understanding of the subtle body language, personal attributes, or intentionality of that person. How can this happen and what do humans perceive? How accurate are they? Our aim is to unveil the process and level of accuracy involved in 3D perception of people from images by assessing human performance. Our approach to establishing an observation-perception link is to make humans re-enact the 3D pose of another person (for which ground truth is available), shown in a photograph, following a short exposure time of 5 seconds. Our apparatus simultaneously captures human pose and eye movements during the pose re-enactment. In the process of perceiving and reproducing the pose, subjects attend first to upper-body joints, with a general trend of focusing more on extremities than on internal joints. Although the resulting scanpaths are pose-dependent, they are quite stable across subjects, both spatially and sequentially. Our study reveals that people are not significantly better at re-enacting 3D poses given visual stimuli, on average, than existing computer vision algorithms. Errors on the order of 10°-20° or 100 mm per 3D body joint position are not uncommon. The contribution of our work can be summarized as follows: (1) the construction of an apparatus relating human visual perception to 3D ground truth; (2) the creation of a publicly available dataset collected from 10 subjects, containing 120 images of humans in different poses, both easy and difficult; and (3) a quantitative analysis of human eye movements, 3D pose re-enactment performance, error levels, stability, correlation, and cross-stimulus control, in order to reveal how different 3D configurations relate to the subject's focus on certain features in images, in the context of the given task.
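
To make the reported error magnitudes concrete, the sketch below shows one common way such figures could be computed: a per-joint Euclidean distance in millimetres and the angle between the same limb in two poses. The abstract does not say how the authors define their error measures, so the function names, the two-joint toy pose, and the choice of metrics here are illustrative assumptions, not the paper's method.

```python
import math

# Hypothetical helpers: the abstract does not specify the authors' exact metrics.

def per_joint_position_errors(reenacted, ground_truth):
    """Euclidean distance per joint, in the units of the input (e.g. mm)."""
    if len(reenacted) != len(ground_truth):
        raise ValueError("poses must have the same number of joints")
    return [math.dist(p, q) for p, q in zip(reenacted, ground_truth)]


def mean_per_joint_position_error(reenacted, ground_truth):
    """Average of the per-joint distances over all joints."""
    errors = per_joint_position_errors(reenacted, ground_truth)
    return sum(errors) / len(errors)


def limb_angle_error_deg(parent_a, child_a, parent_b, child_b):
    """Angle in degrees between the same limb (parent -> child) in two poses."""
    u = [c - p for p, c in zip(parent_a, child_a)]
    v = [c - p for p, c in zip(parent_b, child_b)]
    norm_u, norm_v = math.hypot(*u), math.hypot(*v)
    if norm_u == 0 or norm_v == 0:
        raise ValueError("limb has zero length")
    cos_angle = sum(a * b for a, b in zip(u, v)) / (norm_u * norm_v)
    # Clamp to guard against floating-point drift outside [-1, 1].
    return math.degrees(math.acos(max(-1.0, min(1.0, cos_angle))))


# Toy pose with two joints (shoulder, elbow), coordinates in mm. Made-up values.
ground_truth = [(0.0, 0.0, 0.0), (300.0, 0.0, 0.0)]
reenacted = [(0.0, 0.0, 0.0), (300.0, 0.0, 100.0)]

errors_mm = per_joint_position_errors(reenacted, ground_truth)
mean_mm = mean_per_joint_position_error(reenacted, ground_truth)
angle_deg = limb_angle_error_deg(*ground_truth, *reenacted)

print(errors_mm, mean_mm, round(angle_deg, 1))
```

In this toy case the elbow is displaced by 100 mm, which corresponds to an upper-arm direction error of roughly 18°, so both values fall in the range the abstract describes as not uncommon.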
