
Deep Neural Networks Using Capsule Networks and Skeleton-Based Attentions for Action Recognition

Abstract

This work develops Deep Neural Networks (DNNs) that adopt Capsule Networks (CapsNets) and spatiotemporal skeleton-based attention to recognize subject actions effectively from the abundant spatial and temporal contexts of videos. The proposed generic DNN comprises four 3D Convolutional Neural Networks (3D_CNNs), Attention-Jointed Appearance (AJA) and Attention-Jointed Motion (AJM) generation layers, two Reduction Layers (RLs), two Attention-based Recurrent Neural Networks (A_RNNs), and an inference classifier; its inputs are RGB, transformed-skeleton, and optical-flow channel streams. The AJA and AJM generation layers use the skeleton stream to emphasize the subject's appearance and motion, respectively, and the A_RNNs generate attention weights over time steps to highlight rich temporal contexts. To integrate CapsNets into this generic DNN, three CapsNet-based DNNs are devised, in which a CapsNet replaces the classifier, the A_RNN+classifier, or the RL+A_RNN+classifier. Experimental results reveal that the proposed DNN using a CapsNet as the inference classifier outperforms the other two CapsNet-based DNNs as well as the generic DNN, which adopts a feedforward neural network as its inference classifier. To the best of our knowledge, our best CapsNet-based DNN achieves state-of-the-art average accuracy of 98.5% on UCF101, near-state-of-the-art average accuracy of 82.1% on HMDB51, and 95.3% on panoramic videos. In particular, we find that the generic CapsNet serves as an outstanding inference classifier but is slightly worse than the A_RNN at interpreting temporal evidence for recognition. Therefore, the proposed DNN, which employs a CapsNet as its inference classifier, can be applied advantageously to various context-aware visual applications.
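To make the pipeline concrete, below is a minimal PyTorch sketch of the generic DNN described in the abstract: the three channel streams feed small 3D CNNs, the skeleton features gate the appearance and motion features (the AJA and AJM generation layers), reduction layers compress each time step, attention-based RNNs pool over time, and a feedforward head classifies. All layer sizes, the sigmoid gating, and the shared skeleton backbone are illustrative assumptions rather than the paper's exact configuration; in the paper's best variant, the feedforward head would be replaced by a CapsNet classifier.

```python
# Minimal sketch of the generic pipeline; module names follow the paper's
# terminology (AJA/AJM, RL, A_RNN), but all sizes and wiring are assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F


class Small3DCNN(nn.Module):
    """Tiny stand-in for one of the paper's 3D_CNN feature extractors."""
    def __init__(self, in_ch, out_ch=32):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv3d(in_ch, out_ch, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.AdaptiveAvgPool3d((8, 4, 4)),   # pool to (T=8, H=4, W=4)
        )

    def forward(self, x):                      # x: (B, C, T, H, W)
        return self.net(x)                     # (B, out_ch, 8, 4, 4)


class AttentionRNN(nn.Module):
    """A_RNN: GRU hidden states pooled with learned attention over time steps."""
    def __init__(self, in_dim, hid=64):
        super().__init__()
        self.gru = nn.GRU(in_dim, hid, batch_first=True)
        self.att = nn.Linear(hid, 1)

    def forward(self, seq):                    # seq: (B, T, in_dim)
        h, _ = self.gru(seq)                   # (B, T, hid)
        w = F.softmax(self.att(h), dim=1)      # attention weights over T
        return (w * h).sum(dim=1)              # (B, hid)


class GenericActionNet(nn.Module):
    """Generic DNN: three streams, AJA/AJM gating, RLs, A_RNNs, classifier."""
    def __init__(self, num_classes=101, hid=64):
        super().__init__()
        # The paper uses four 3D_CNNs; this sketch shares a single skeleton
        # backbone across the appearance and motion paths for brevity.
        self.rgb_cnn = Small3DCNN(3)           # appearance stream
        self.skel_cnn = Small3DCNN(1)          # transformed-skeleton stream
        self.flow_cnn = Small3DCNN(2)          # optical-flow stream
        feat = 32 * 4 * 4                      # per-time-step feature size
        self.reduce_a = nn.Linear(feat, 128)   # RL on the AJA path
        self.reduce_m = nn.Linear(feat, 128)   # RL on the AJM path
        self.rnn_a = AttentionRNN(128, hid)
        self.rnn_m = AttentionRNN(128, hid)
        # Feedforward inference classifier; the paper's best variant swaps
        # this head for a CapsNet classifier.
        self.classifier = nn.Linear(2 * hid, num_classes)

    def forward(self, rgb, skel, flow):
        gate = torch.sigmoid(self.skel_cnn(skel))   # skeleton-based attention
        aja = self.rgb_cnn(rgb) * gate              # Attention-Jointed Appearance
        ajm = self.flow_cnn(flow) * gate            # Attention-Jointed Motion
        # Flatten spatial dims; treat pooled frames as a sequence: (B, T, feat)
        a = self.reduce_a(aja.permute(0, 2, 1, 3, 4).flatten(2))
        m = self.reduce_m(ajm.permute(0, 2, 1, 3, 4).flatten(2))
        z = torch.cat([self.rnn_a(a), self.rnn_m(m)], dim=1)
        return self.classifier(z)


if __name__ == "__main__":
    # Smoke test on random 16-frame clips at 32x32 resolution.
    rgb = torch.randn(2, 3, 16, 32, 32)
    skel = torch.randn(2, 1, 16, 32, 32)
    flow = torch.randn(2, 2, 16, 32, 32)
    print(GenericActionNet()(rgb, skel, flow).shape)   # torch.Size([2, 101])
```

The elementwise sigmoid gate is one plausible reading of "emphasizing skeletons to the appearances and motions"; the key structural point it illustrates is that skeleton features modulate both the RGB and optical-flow paths before temporal attention and classification.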
