Applied Sciences

Body-Part-Aware and Multitask-Aware Single-Image-Based Action Recognition



Abstract

Action recognition is an application that, ideally, requires real-time results. We focus on single-image-based action recognition rather than video-based recognition because of its higher speed and lower computational cost. However, a single image contains limited information, which makes single-image-based action recognition a difficult problem. To obtain an accurate representation of action classes, we propose three feature-stream-based shallow sub-networks (image-based, attention-image-based, and part-image-based feature networks) built on a deep pose estimation network in a multitasking manner. Moreover, we design a multitask-aware loss function so that the proposed method can be adaptively trained with heterogeneous datasets that contain only human pose annotations or only action labels (rather than both), which makes it easier to apply the proposed approach to new data for behavioral analysis in intelligent systems. In extensive experiments, we showed that these streams carry complementary information and, hence, that the fused representation is robust in distinguishing diverse fine-grained action classes. Unlike in other methods, the human pose information was trained with heterogeneous datasets in a multitasking manner; nevertheless, the approach achieved a mean average precision of 91.91% on the Stanford 40 Actions Dataset. Moreover, we demonstrated that the proposed method can be flexibly applied to the multi-label action recognition problem on the V-COCO Dataset.
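The abstract describes a multitask-aware loss that lets the network train on heterogeneous datasets in which a sample carries only pose annotations, only action labels, or both. The paper's exact formulation is not given here; the following is a minimal PyTorch-style sketch of one way such a masked multitask loss could be written, with assumed names (multitask_aware_loss, a heatmap-MSE pose term, a cross-entropy action term) used purely for illustration.

import torch
import torch.nn.functional as F

def multitask_aware_loss(pose_pred, pose_gt, action_logits, action_gt,
                         has_pose, has_action,
                         pose_weight=1.0, action_weight=1.0):
    # pose_pred, pose_gt: (B, K, H, W) keypoint heatmaps
    # action_logits: (B, C) class scores, action_gt: (B,) class indices
    # has_pose, has_action: (B,) boolean masks marking which annotations exist
    device = action_logits.device

    # Pose term: heatmap regression loss, computed only on samples that have pose labels.
    if has_pose.any():
        pose_loss = F.mse_loss(pose_pred[has_pose], pose_gt[has_pose])
    else:
        pose_loss = torch.zeros((), device=device)

    # Action term: classification loss, computed only on samples that have action labels.
    # For a multi-label setting (e.g., V-COCO), this term could instead use
    # F.binary_cross_entropy_with_logits with multi-hot targets.
    if has_action.any():
        action_loss = F.cross_entropy(action_logits[has_action], action_gt[has_action])
    else:
        action_loss = torch.zeros((), device=device)

    return pose_weight * pose_loss + action_weight * action_loss

Because each term is masked by the annotations actually present, batches drawn from a pose-only dataset and an action-only dataset can be mixed freely without requiring both label types for every image.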
