Multitask Learning Method for Detecting the Visual Focus of Attention of Construction Workers

Abstract

The visual focus of attention (VFOA) of construction workers is a critical cue for recognizing entity interactions, which in turn facilitates the interpretation of workers' intentions, the prediction of movements, and the comprehension of the jobsite context. The increasing use of construction surveillance cameras provides a cost-efficient way to estimate workers' VFOA from information-rich images. However, the low resolution of these images poses a great challenge to detecting the facial features and gaze directions. Recognizing that body and head orientations provide strong hints to infer workers' VFOA, this study proposes to represent the VFOA as a collection of body orientations, body poses, head yaws, and head pitches and designs a convolutional neural network (CNN)-based multitask learning (MTL) framework to automatically estimate workers' VFOA using low-resolution construction images. The framework is composed of two modules. In the first module, a Faster regional CNN (R-CNN) object detector is used to detect and extract workers' full-body images, and the resulting full-body images serve as a single input to the CNN-MTL model in the second module. In the second module, the VFOA estimation is formulated as a multitask image classification problem where four classification tasks (body orientation, body pose, head yaw, and head pitch) are jointly learned by the newly designed CNN-MTL model. Construction videos were used to train and test the proposed framework. The results show that the proposed CNN-MTL model achieves an accuracy of 0.91, 0.95, 0.86, and 0.83 in body orientation, body pose, head yaw, and head pitch classification, respectively. Compared with the conventional single-task learning, the MTL method reduces training time by almost 50% without compromising accuracy. © 2021 American Society of Civil Engineers.
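
The second module described in the abstract, one shared model with four jointly learned classification tasks, can be illustrated with a minimal sketch. This is not the authors' CNN-MTL architecture: a small dense layer stands in for the convolutional backbone, the class counts in `TASKS`, the feature size, and the equal weighting of the four losses are all assumptions made for illustration, and the input is a random vector rather than a worker crop produced by a detector.

```python
import math
import random

# Hypothetical class counts per task; the abstract does not state them.
TASKS = {"body_orientation": 8, "body_pose": 3, "head_yaw": 5, "head_pitch": 3}
FEATURE_DIM = 16  # assumed size of the shared feature vector


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def linear(weights, bias, x):
    """Dense layer: one output value per weight row."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]


def init_layer(rng, n_out, n_in):
    """Small random weights and zero biases."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


class MultiTaskClassifier:
    """Shared trunk followed by one softmax head per task."""

    def __init__(self, input_dim, seed=0):
        rng = random.Random(seed)
        self.trunk = init_layer(rng, FEATURE_DIM, input_dim)
        self.heads = {name: init_layer(rng, n, FEATURE_DIM) for name, n in TASKS.items()}

    def forward(self, x):
        # Shared representation (stand-in for a CNN backbone) with ReLU.
        shared = [max(0.0, v) for v in linear(*self.trunk, x)]
        # Every head reads the same shared features.
        return {name: softmax(linear(*head, shared)) for name, head in self.heads.items()}


def joint_loss(probs, labels):
    """Sum of per-task cross-entropy losses (equal task weights assumed)."""
    return sum(-math.log(probs[name][labels[name]]) for name in TASKS)


# Usage: one fake "full-body crop" flattened to 32 numbers, with made-up labels.
rng = random.Random(1)
crop = [rng.random() for _ in range(32)]
labels = {"body_orientation": 2, "body_pose": 0, "head_yaw": 4, "head_pitch": 1}

model = MultiTaskClassifier(input_dim=32)
probs = model.forward(crop)
loss = joint_loss(probs, labels)
predictions = {name: max(range(len(p)), key=p.__getitem__) for name, p in probs.items()}
```

Because all four heads share one trunk, a single forward pass yields all four predictions, which is the structural reason a multitask model can need less training time than four separate single-task models.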

Bibliographic details

  • Source
    Journal of Construction Engineering and Management | 2021, Issue 7 | 04021063.1-04021063.13 | 13 pages
  • Author affiliations

    Univ Texas San Antonio Dept Construct Sci 501 W Cesar E Chavez Blvd San Antonio TX 78207 USA;

    Purdue Univ Lyles Sch Civil Engn 550 Stadium Mall Dr W Lafayette IN 47907 USA;

    Purdue Univ Lyles Sch Civil Engn 550 Stadium Mall Dr W Lafayette IN 47907 USA;

    Univ Tennessee Dept Civil & Environm Engn 851 Neyland Dr Knoxville TN 37996 USA;

    Purdue Univ Lyles Sch Civil Engn 550 Stadium Mall Dr W Lafayette IN 47907 USA;

  • Original format: PDF
  • Language of text: English
