Neurocomputing

Rethinking the ST-GCNs for 3D skeleton-based human action recognition

Abstract

Skeletal data has become an alternative for the human action recognition task, as it provides more compact and distinctive information than traditional RGB input. However, unlike RGB input, skeleton data lies in a non-Euclidean space, where traditional deep learning methods cannot realize their full potential. Fortunately, with the emerging trend of geometric deep learning, the spatial-temporal graph convolutional network (ST-GCN) has been proposed to address the action recognition problem from skeleton data. ST-GCN and its variants fit skeleton-based action recognition well and are becoming the mainstream frameworks for this task. However, both the efficiency and the performance of the task are hindered by either fixing the skeleton joint correlations or employing a computationally expensive strategy to construct a dynamic topology for the skeleton. We argue that many of these operations are unnecessary or even harmful for the task. By theoretically and experimentally analysing the state-of-the-art ST-GCNs, we provide a simple but efficient strategy to capture the global graph correlations and thus efficiently model the representation of the input graph sequences. Moreover, the global graph strategy also reduces the graph sequence into Euclidean space, so a multi-scale temporal filter is introduced to efficiently capture the dynamic information. With this method, we are not only able to better extract the graph correlations with far fewer parameters (only 12.6% of the current best), but we also achieve superior performance. Extensive experiments on the current largest 3D datasets, NTU-RGB+D and NTU-RGB+D 120, demonstrate that our network performs this task efficiently with a lightweight model. (c) 2021 The Authors. Published by Elsevier B.V. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).
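The abstract gives no implementation details, but the two ideas it names, a globally learned joint-correlation matrix shared across all frames and a multi-scale temporal filter, can be sketched as follows. This is a minimal PyTorch sketch under stated assumptions, not the authors' code: the module name GlobalGCNBlock, the kernel sizes, and the channel widths are all illustrative choices.

```python
# Hypothetical sketch of the two ideas in the abstract: one learnable global
# adjacency shared by every frame (instead of a per-sample dynamic topology),
# followed by multi-scale temporal convolutions over the frame axis.
import torch
import torch.nn as nn


class GlobalGCNBlock(nn.Module):
    """Spatial graph conv with a single global adjacency, then
    parallel temporal convs at several kernel sizes."""

    def __init__(self, in_ch, out_ch, num_joints, t_kernels=(3, 5, 7)):
        super().__init__()
        # Global joint-correlation matrix, learned end-to-end and shared
        # across all frames and samples; initialized near the identity.
        self.A = nn.Parameter(
            torch.eye(num_joints) + 1e-3 * torch.randn(num_joints, num_joints)
        )
        # Per-joint feature transform (1x1 conv over the channel axis).
        self.spatial = nn.Conv2d(in_ch, out_ch, kernel_size=1)
        # One temporal branch per kernel size; odd kernels with symmetric
        # padding keep the frame count unchanged.
        self.temporal = nn.ModuleList(
            [nn.Conv2d(out_ch, out_ch, kernel_size=(k, 1), padding=(k // 2, 0))
             for k in t_kernels]
        )
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        # x: (batch, channels, frames, joints)
        x = self.spatial(x)                            # transform joint features
        x = torch.einsum("nctv,vw->nctw", x, self.A)   # mix joints via global graph
        x = self.relu(x)
        # Multi-scale temporal filtering: each branch sees a different horizon.
        x = sum(conv(x) for conv in self.temporal) / len(self.temporal)
        return self.relu(x)


if __name__ == "__main__":
    # NTU-RGB+D skeletons have 25 joints with 3D coordinates per joint.
    block = GlobalGCNBlock(in_ch=3, out_ch=64, num_joints=25)
    clip = torch.randn(2, 3, 64, 25)   # (batch, xyz, frames, joints)
    print(block(clip).shape)           # torch.Size([2, 64, 64, 25])
```

Learning a single global adjacency end-to-end avoids the per-sample topology construction that the abstract identifies as computationally expensive, which is plausibly where the reported parameter savings come from; the exact architecture should be taken from the paper itself.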