Journal: IEEE Transactions on Image Processing

Representation Learning of Temporal Dynamics for Skeleton-Based Action Recognition


Abstract

Motion characteristics of human actions can be represented by the position variation of skeleton joints. Traditional approaches generally extract the spatial–temporal representation of skeleton sequences with well-designed hand-crafted features. In this paper, in order to recognize actions according to the relative motion between the limbs and the trunk, we propose an end-to-end hierarchical RNN for skeleton-based action recognition. We divide the human skeleton into five main parts according to the human physical structure, and then feed them to five independent subnets for local feature extraction. After the subsequent hierarchical feature fusion and extraction from local to global, the dimensionality of the final temporal dynamics representation is reduced, through a single-layer perceptron, to the number of action categories in the corresponding data set. The output of the perceptron is then temporally accumulated and used as the input of a softmax layer for classification. Random scale and rotation transformations are employed during training to improve robustness. To verify the effectiveness of the proposed network, we compare it with five other deep RNN variants derived from our model. In addition, we compare it with several other methods on motion capture and Kinect data sets. Furthermore, we evaluate the robustness of our model trained with random scale and rotation transformations on a multiview problem. Experimental results demonstrate that our model achieves state-of-the-art performance with high computational efficiency.
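The pipeline described in the abstract can be illustrated with a small, untrained NumPy sketch. It is not the authors' implementation, and most specifics are assumptions made for illustration: the 20-joint layout and the `PARTS` index grouping are hypothetical, a plain tanh RNN stands in for whatever recurrent unit the paper uses, the fusion order (parts, then upper and lower body, then whole body) is one possible local-to-global scheme, and the rotation axis and the angle and scale ranges are arbitrary choices. The sketch only shows how data flows: random scale and rotation, five part subnets, hierarchical fusion, a single-layer perceptron, temporal accumulation, and softmax.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical joint indices for a 20-joint skeleton (assumption, not from the paper).
PARTS = {
    "left_arm": [0, 1, 2, 3],
    "right_arm": [4, 5, 6, 7],
    "trunk": [8, 9, 10, 11],
    "left_leg": [12, 13, 14, 15],
    "right_leg": [16, 17, 18, 19],
}

def random_rotation_scale(seq, max_angle=np.pi / 6, scale_range=(0.9, 1.1)):
    """Apply one random rotation about the y axis and one random scale
    to a (T, J, 3) sequence. Axis and ranges are assumed values."""
    angle = rng.uniform(-max_angle, max_angle)
    scale = rng.uniform(*scale_range)
    c, s = np.cos(angle), np.sin(angle)
    rot = np.array([[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]])
    return (scale * seq) @ rot.T

def make_rnn(in_dim, hid_dim):
    """Randomly initialised parameters of a plain tanh RNN (stand-in unit)."""
    return {
        "W": rng.normal(0.0, 0.1, (in_dim, hid_dim)),
        "U": rng.normal(0.0, 0.1, (hid_dim, hid_dim)),
        "b": np.zeros(hid_dim),
    }

def run_rnn(params, x):
    """Run the RNN over x of shape (T, in_dim); return all hidden states (T, hid_dim)."""
    h = np.zeros(params["b"].shape[0])
    out = np.empty((x.shape[0], h.shape[0]))
    for t in range(x.shape[0]):
        h = np.tanh(x[t] @ params["W"] + h @ params["U"] + params["b"])
        out[t] = h
    return out

def build_model(num_classes, hid=16):
    return {
        "parts": {name: make_rnn(3 * len(idx), hid) for name, idx in PARTS.items()},
        "upper": make_rnn(3 * hid, hid),   # arms + trunk (assumed fusion)
        "lower": make_rnn(3 * hid, hid),   # legs + trunk (assumed fusion)
        "body": make_rnn(2 * hid, hid),    # whole body
        "W_out": rng.normal(0.0, 0.1, (hid, num_classes)),
        "b_out": np.zeros(num_classes),
    }

def forward(model, seq):
    """seq: (T, J, 3) joint positions -> class probabilities (num_classes,)."""
    T = seq.shape[0]
    # Local feature extraction: one independent subnet per body part.
    local = {name: run_rnn(model["parts"][name], seq[:, idx].reshape(T, -1))
             for name, idx in PARTS.items()}
    # Hierarchical fusion from local to global by concatenating features.
    upper = run_rnn(model["upper"], np.concatenate(
        [local["left_arm"], local["right_arm"], local["trunk"]], axis=1))
    lower = run_rnn(model["lower"], np.concatenate(
        [local["left_leg"], local["right_leg"], local["trunk"]], axis=1))
    body = run_rnn(model["body"], np.concatenate([upper, lower], axis=1))
    # Single-layer perceptron reduces each frame to num_classes scores.
    per_frame = body @ model["W_out"] + model["b_out"]
    # Temporal accumulation, then a numerically stable softmax.
    accumulated = per_frame.sum(axis=0)
    z = np.exp(accumulated - accumulated.max())
    return z / z.sum()

seq = rng.normal(size=(30, 20, 3))          # synthetic 30-frame sequence
model = build_model(num_classes=10)
probs = forward(model, random_rotation_scale(seq))
print(probs.shape, probs.sum())
```

Because the weights are random, the resulting probabilities carry no meaning; the point is that a whole sequence yields a single class distribution, since the per-frame scores are summed over time before the softmax.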
