Venue: International Symposium on Communications and Information Technologies

Unsupervised learning of space-time symmetric patterns in RGB-D videos for 4D human activity detection



Abstract

In this paper, we present an approach for finding a space-time activity map in a video shot using 3D moment methods. An RGB-D video involving a specific human activity is first regularly partitioned into multiple video shots in which human activities can be defined. Each video shot is further separated into multiple video cubes that characterize local object shape and motion. Given a local video cube, the proposed space-time pattern detector extracts both spatial and temporal symmetric information, which is then grouped by hashing to construct an activity map describing the distribution of object motion vectors in the video shot. The intrinsic human activity in a video consisting of multiple shots is thus represented by a set of activity maps. Next, to reduce the temporal dimensionality of an activity represented by activity maps, kernel PCA is applied to transform the activity representation into a set of principal activity maps. Finally, regardless of the activity types of the training videos, all training principal activity maps are clustered into multiple clusters to generate a principal activity map dictionary. This dictionary is used to solve the initial pose problem when dynamic programming is applied to align two sequences of principal activity maps for recognizing human activities in RGB-D videos. The proposed approach was tested on publicly available datasets. Experimental results demonstrate the good performance of the proposed method in terms of activity detection accuracy and execution speed.
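Two numeric steps in the pipeline above can be sketched concretely: the kernel PCA projection that turns a sequence of activity maps into principal activity maps, and the dynamic-programming alignment of two such sequences. The sketch below is a minimal, hypothetical illustration only: it assumes each activity map is flattened into a row vector, uses an RBF kernel, and implements alignment as standard dynamic time warping; the function names, kernel choice, and distance measure are assumptions, not the authors' exact formulation.

```python
import numpy as np

def kernel_pca(X, n_components, gamma=1.0):
    """Project rows of X (flattened activity maps) onto kernel principal components."""
    # RBF (Gaussian) kernel matrix between all pairs of activity maps.
    sq = np.sum(X ** 2, axis=1)
    K = np.exp(-gamma * (sq[:, None] + sq[None, :] - 2.0 * X @ X.T))
    # Center the kernel matrix in feature space.
    n = K.shape[0]
    one_n = np.ones((n, n)) / n
    Kc = K - one_n @ K - K @ one_n + one_n @ K @ one_n
    # Eigendecomposition of the symmetric centered kernel; keep top components.
    vals, vecs = np.linalg.eigh(Kc)
    order = np.argsort(vals)[::-1][:n_components]
    vals, vecs = vals[order], vecs[:, order]
    # Projections of the training points onto the principal components.
    return Kc @ (vecs / np.sqrt(np.maximum(vals, 1e-12)))

def dtw_distance(A, B):
    """Dynamic-programming (DTW) alignment cost between two sequences
    of principal activity maps, each given as a list of vectors."""
    n, m = len(A), len(B)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = np.linalg.norm(A[i - 1] - B[j - 1])  # local map distance
            D[i, j] = cost + min(D[i - 1, j],      # skip a frame in A
                                 D[i, j - 1],      # skip a frame in B
                                 D[i - 1, j - 1])  # match both frames
    return D[n, m]
```

In this reading, the principal activity map dictionary would supply candidate starting maps so the alignment is not penalized by an unknown initial pose; the DTW cost could then serve as the matching score between a query video and each training activity.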
