Three-stream Very Deep Neural Network for Video Action Recognition


Abstract

This study asks whether fine-tuning a very deep three-dimensional convolutional neural network (3D CNN) that has already been pre-trained on an adequately large video dataset provides sufficient motion information for action recognition, or whether supplementary information is still needed. We introduce a three-stream CNN built on two-dimensional (2D) and 3D kernels that leverages networks successfully pre-trained on the ImageNet and Kinetics datasets. To analyze these streams, we fine-tune each on the challenging HMDB-51 dataset and show that supplementary motion information (optical flow and the proposed sparse trajectory image) is critical to action recognition even when a 3D CNN is used. Experimental results show that our network reaches 80.92% accuracy on the HMDB-51 dataset, comparable to the performance of state-of-the-art networks on this dataset.
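
The abstract does not say how the three streams are combined. As a rough illustration only, the sketch below assumes late fusion: each stream produces class logits, the logits are turned into softmax scores, and the scores are averaged with optional weights. The stream names (rgb_3d, optical_flow, trajectory_image), the function names, and the logit values are all hypothetical and are not taken from the paper.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of class logits."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_streams(stream_logits, weights=None):
    """Weighted average of per-stream softmax scores (late fusion).

    stream_logits: dict mapping a stream name to its list of class logits.
    weights: optional dict mapping a stream name to a non-negative weight;
             equal weights are assumed when omitted.
    """
    names = list(stream_logits)
    if weights is None:
        weights = {name: 1.0 for name in names}
    total_weight = sum(weights[name] for name in names)
    num_classes = len(stream_logits[names[0]])
    fused = [0.0] * num_classes
    for name in names:
        probs = softmax(stream_logits[name])
        share = weights[name] / total_weight
        for cls, p in enumerate(probs):
            fused[cls] += share * p
    return fused


# Hypothetical logits for a 3-class toy problem (not results from the paper).
example = {
    "rgb_3d": [2.0, 1.0, 0.1],            # assumed appearance/3D stream
    "optical_flow": [0.5, 2.5, 0.2],      # assumed optical-flow stream
    "trajectory_image": [0.3, 2.0, 0.4],  # assumed sparse-trajectory stream
}
fused = fuse_streams(example)
predicted = max(range(len(fused)), key=fused.__getitem__)
```

In this toy input the first stream alone would pick class 0, while the two motion streams pull the fused prediction toward class 1, which is the kind of effect the abstract attributes to supplementary motion information. Other fusion schemes (for example, a learned classifier over concatenated features) would be equally consistent with the abstract.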


Bibliographic details

Source
International Conference on Pattern Recognition and Image Analysis | 2019 | pp. 80-86 | 7 pages
Authors
Nasim Khani; Mehdi Rezaeian
Original format: PDF
Keywords
Three-dimensional displays; Streaming media; Optical flow; Trajectory; Training; Kinetic theory; Spatiotemporal phenomena


Similar literature


1. Three-Stream Network With Bidirectional Self-Attention for Action Recognition in Extreme Low Resolution Videos [J]. Didik Purwanto, Rizard Renanda Adhi Pramono, Yie-Tarng Chen. IEEE Signal Processing Letters. 2019, No. 8
2. Deep Neural Networks for Iris Recognition System Based on Video: Stacked Sparse Auto Encoders (SSAE) and Bi-Propagation Neural Network Models [J]. Asama Kuder Nseaf, Azizah Jaafar, Khider Nassif Jassim. Journal of Theoretical and Applied Information Technology. 2016, No. 2
3. Hyperspectral and LiDAR Fusion Using Deep Three-Stream Convolutional Neural Networks [J]. Hao Li, Pedram Ghamisi, Uwe Soergel. Remote Sensing. 2018, No. 10
4. Three-stream Very Deep Neural Network for Video Action Recognition [C]. Nasim Khani, Mehdi Rezaeian. International Conference on Pattern Recognition and Image Analysis. 2019
5. Action Recognition from Videos using Deep Neural Networks [D]. Ghewari, Rishikesh Sanjay. 2017
6. Zero-Shot Action Recognition with Three-Stream Graph Convolutional Networks [O]. Nan Wu, Kazuhiko Kawamoto. 2021
7. AdaScan: Adaptive Scan Pooling in Deep Convolutional Neural Networks for Human Action Recognition in Videos [O]. Kar, Amlan; Rai, Nishant; Sikka, Karan. 2017
