Conference: Construction Research Congress
Estimating the Visual Attention of Construction Workers from Head Pose Using Convolutional Neural Network-Based Multi-Task Learning

Abstract

The visual attention of construction workers is an important indicator for assessing their situational awareness and inferring their intentions, which in turn helps reduce construction injuries and improve site safety. Several studies have adopted eye-tracking technology to measure gaze direction directly and determine workers' visual attention. However, eye-trackers are expensive, and wearing them may disturb normal operations. Given the increasing use of surveillance video and the availability of construction images, there is great potential to estimate workers' visual attention from imagery data, yet existing studies have not exploited it well. This paper presents a convolutional neural network (CNN)-based multi-task learning framework that estimates the visual attention of construction workers from head pose using low-resolution images. Visual attention is approximated by head yaw and pitch orientation. The problem is formulated as a multi-task image classification problem in which the first task is head yaw classification and the second is head pitch classification. The CNN-based multi-task learning framework is designed to learn the two tasks jointly, with shared layers capturing the commonality between tasks and task-specific layers modeling the uniqueness of each task. Compared with the traditional single-task learning mechanism, which trains a separate classifier for each task, the proposed approach leverages the commonality of related tasks and captures a shared representation, which can significantly improve efficiency and performance. The results suggest that the proposed multi-task learning framework can achieve an accuracy of 76.5% for head yaw estimation and 88.7% for head pitch estimation, better than the performance obtained with conventional single-task learning.
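The shared-layer and task-specific-layer structure described in the abstract can be illustrated with a minimal sketch. It is not the authors' network: a single dense layer stands in for the shared convolutional layers, and the class name `MultiTaskHeadPoseSketch`, the layer sizes, the numbers of yaw and pitch classes, and the `pitch_weight` term in the joint loss are all hypothetical choices made for illustration, since the abstract does not specify them.

```python
import math
import random


def relu(v):
    return [max(0.0, x) for x in v]


def linear(weights, bias, v):
    # weights holds one row per output unit
    return [sum(w * x for w, x in zip(row, v)) + b
            for row, b in zip(weights, bias)]


def softmax(logits):
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(probs, label):
    return -math.log(probs[label])


def init_layer(rng, n_out, n_in):
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out


class MultiTaskHeadPoseSketch:
    """Shared layer feeding two task-specific classification heads.

    Hypothetical stand-in: a dense layer replaces the shared CNN layers,
    and all sizes are illustrative assumptions.
    """

    def __init__(self, n_features, n_hidden, n_yaw, n_pitch, seed=0):
        rng = random.Random(seed)
        self.shared = init_layer(rng, n_hidden, n_features)   # shared by both tasks
        self.yaw_head = init_layer(rng, n_yaw, n_hidden)      # task 1: head yaw
        self.pitch_head = init_layer(rng, n_pitch, n_hidden)  # task 2: head pitch

    def forward(self, features):
        hidden = relu(linear(*self.shared, features))
        yaw_probs = softmax(linear(*self.yaw_head, hidden))
        pitch_probs = softmax(linear(*self.pitch_head, hidden))
        return yaw_probs, pitch_probs

    def joint_loss(self, features, yaw_label, pitch_label, pitch_weight=1.0):
        # Assumed joint objective: weighted sum of the two cross-entropy losses.
        yaw_probs, pitch_probs = self.forward(features)
        return (cross_entropy(yaw_probs, yaw_label)
                + pitch_weight * cross_entropy(pitch_probs, pitch_label))
```

Because both heads read the same hidden representation, a gradient step on the joint loss would update the shared layer with signal from both tasks, whereas single-task learning would train two unrelated feature extractors.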
