
Massive-scale complicated human action recognition: Theory and applications

Abstract

Recognizing the variety of human actions in video is a challenging task and one of the key problems in computer vision. It has received extensive attention from AI researchers (Bakker et al., 2003; Bruderlin and Williams, 1995; Cardie et al., 2003; Carlsson, 1996, 1999; Clausen and Kurth, 2004), and it has important applications in human behavior analysis, artificial intelligence, and video surveillance. Compared with still-image classification, the temporal component of video provides important clues for recognition, so multiple human actions can be recognized from motion information; in addition, video provides natural data augmentation for individual frames. For action recognition from video, appearance and temporal dynamics are two key and complementary cues. In this work, we first formulate a human action recognition framework based on 2D spatial feature fusion of Kinect skeleton data. The method uses the structure and spatial geometry of the human body to represent the body and extract features, combines active-action and auxiliary-action features from the two spatial dimensions in a layered manner, and applies the widely used support vector machine and hidden Markov model to classify the resulting representations. Extracting this information is difficult because of background clutter, viewpoint changes, scale changes, varying lighting conditions, and camera motion, so designing effective representations while learning the classification information of behavior categories is the key to addressing these challenges. We therefore also propose a method for human action recognition in video based on the ResNeXt network. Using both RGB frames and optical flow, the proposed ResNeXt network extracts richer appearance and temporal features of human actions and thereby achieves better action classification. A temporal segmentation method is applied to the video so that long-range temporal information can be captured and exploited more effectively. Comprehensive experimental results show that the proposed method improves performance on the UCF101 and HMDB51 action recognition datasets to a certain extent.
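
To illustrate the two-stream, segment-based design described in the abstract (a ResNeXt backbone applied to RGB frames and to stacked optical flow, with segment-level score averaging to cover long-range temporal structure), the following is a minimal PyTorch sketch. The segment count, the flow-stack depth, the equal-weight fusion of the two streams, and the use of torchvision's resnext50_32x4d backbone are illustrative assumptions, not the configuration reported in the paper.

# Minimal sketch of a two-stream, temporal-segment action classifier.
# Hyperparameters and the backbone choice are illustrative assumptions.
import torch
import torch.nn as nn
from torchvision.models import resnext50_32x4d


class TwoStreamSegmentNet(nn.Module):
    """RGB + optical-flow ResNeXt streams with segment-level score averaging."""

    def __init__(self, num_classes: int, num_segments: int = 3, flow_stack: int = 10):
        super().__init__()
        self.num_segments = num_segments

        # RGB stream: standard 3-channel ResNeXt backbone (no pretrained weights here).
        self.rgb_net = resnext50_32x4d()
        self.rgb_net.fc = nn.Linear(self.rgb_net.fc.in_features, num_classes)

        # Flow stream: same backbone, but the first convolution takes a stack of
        # 2 * flow_stack optical-flow channels (x/y displacement maps per frame pair).
        self.flow_net = resnext50_32x4d()
        self.flow_net.conv1 = nn.Conv2d(2 * flow_stack, 64, kernel_size=7,
                                        stride=2, padding=3, bias=False)
        self.flow_net.fc = nn.Linear(self.flow_net.fc.in_features, num_classes)

    def forward(self, rgb_segments, flow_segments):
        # rgb_segments:  (batch, num_segments, 3, H, W)
        # flow_segments: (batch, num_segments, 2 * flow_stack, H, W)
        b, s = rgb_segments.shape[:2]

        rgb_scores = self.rgb_net(rgb_segments.flatten(0, 1)).view(b, s, -1)
        flow_scores = self.flow_net(flow_segments.flatten(0, 1)).view(b, s, -1)

        # Segmental consensus: average class scores over the sampled segments,
        # then fuse the two streams (equal weights assumed here).
        return rgb_scores.mean(dim=1) + flow_scores.mean(dim=1)


if __name__ == "__main__":
    model = TwoStreamSegmentNet(num_classes=101)      # e.g. UCF101
    rgb = torch.randn(2, 3, 3, 224, 224)              # 2 videos, 3 RGB segments each
    flow = torch.randn(2, 3, 20, 224, 224)            # matching stacked-flow segments
    print(model(rgb, flow).shape)                     # torch.Size([2, 101])

In this sketch, each video contributes a few evenly spaced segments; every segment is scored independently by both streams and the class scores are averaged before the streams are fused, which is what lets a short per-segment backbone still reflect long-range temporal structure.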

Bibliographic Record

  • Source
    Future Generation Computer Systems | 2021, Issue 12 | pp. 806-811 | 6 pages
  • Author Affiliations

    Business School, Xiamen Institute of Technology, Xiamen 361000, Fujian, China;

    Center for Faculty Development, Fuzhou University of International Studies and Trade, Fuzhou 350000, Fujian, China;

    School of Economics, Fujian Normal University, Fuzhou 350007, Fujian, China;

  • Indexed in: Science Citation Index (SCI); Engineering Index (EI)
  • Format: PDF
  • Language: English
  • CLC Classification
  • Keywords

    Robot perception management; VQA; 3D CNN; Video-related text; Bi-LSTM;

