Multi-level Three-Stream Convolutional Networks for Video-Based Action Recognition

Abstract

Deep convolutional neural networks (ConvNets) have shown remarkable capability for visual feature learning and representation. In the field of video-based action recognition, much progress has been made with the development of ConvNets. However, mainstream ConvNets used for video-based action recognition, such as two-stream ConvNets and 3D ConvNets, still lack the ability to represent fine-grained features. In this paper, we propose a novel architecture named multi-level three-stream convolutional network (MLTSN), which contains three streams: the spatial stream, the temporal stream, and the multi-level correlation stream (MLCS). The MLCS contains several correlation modules, which fuse appearance and motion features at the same levels and obtain spatio-temporal correlation maps. The correlation maps are then fed into several convolutional layers to obtain refined features. The whole network is trained in a multi-step manner. Extensive experimental results show that the performance of the proposed network is competitive with state-of-the-art methods on HMDB51 and UCF101.
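
The abstract does not say which operation the correlation modules use, so the following is only a minimal sketch of the general idea of pairing appearance and motion features level by level. It assumes, purely for illustration, that a correlation map is the cosine similarity between the two feature vectors at each spatial location. The function names `correlation_map` and `multi_level_correlation` are hypothetical and do not come from the paper.

```python
import math


def correlation_map(appearance, motion, eps=1e-8):
    """Cosine similarity between appearance and motion features per location.

    Assumption for illustration only: the paper's abstract does not define
    the correlation operation. Inputs are nested lists shaped
    [channels][height][width]; the result is shaped [height][width].
    """
    channels = len(appearance)
    height = len(appearance[0])
    width = len(appearance[0][0])
    result = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            a = [appearance[c][y][x] for c in range(channels)]
            m = [motion[c][y][x] for c in range(channels)]
            dot = sum(ai * mi for ai, mi in zip(a, m))
            norm_a = math.sqrt(sum(ai * ai for ai in a))
            norm_m = math.sqrt(sum(mi * mi for mi in m))
            # eps avoids division by zero for all-zero feature vectors
            result[y][x] = dot / (norm_a * norm_m + eps)
    return result


def multi_level_correlation(appearance_levels, motion_levels):
    """Return one correlation map per level, pairing same-level features."""
    if len(appearance_levels) != len(motion_levels):
        raise ValueError("both streams must provide the same number of levels")
    return [correlation_map(a, m)
            for a, m in zip(appearance_levels, motion_levels)]


# Toy features: 2 channels, 1 row, 2 columns
appearance = [[[1.0, 0.0]], [[0.0, 1.0]]]
motion = [[[1.0, 1.0]], [[0.0, 0.0]]]
maps = multi_level_correlation([appearance, appearance], [motion, motion])
```

In the described architecture, maps like these would be passed on to further convolutional layers; that refinement step and the multi-step training are not sketched here.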

Bibliographic details

Source
Chinese Conference on Pattern Recognition and Computer Vision | 2018 | pp. 237-249 | 13 pages
Authors
Yijing Lv; Huicheng Zheng; Wei Zhang
Original format
PDF
Keywords
Action recognition; Convolutional networks; Multi-level correlation mechanism

Similar literature

1. Video-Based Human Action Recognition Using Spatial Pyramid Pooling and 3D Densely Convolutional Networks [J]. Wanli Yang, Yimin Chen, Chen Huang. Future Internet, 2018, No. 12
2. NIRExpNet: Three-Stream 3D Convolutional Neural Network for Near Infrared Facial Expression Recognition [J]. Zhihao Zhang, Guangyuan Liu, Zhan Wu. Applied Sciences, 2017, No. 11
3. Multi-level Three-Stream Convolutional Networks for Video-Based Action Recognition [C]. Yijing Lv, Huicheng Zheng, Wei Zhang. Chinese Conference on Pattern Recognition and Computer Vision, 2018
4. Improving Facial Action Unit Recognition Using Convolutional Neural Networks [D]. Han, Shizhong. 2017
5. Zero-Shot Action Recognition with Three-Stream Graph Convolutional Networks [O]. Nan Wu, Kazuhiko Kawamoto. 2021
6. Video-Based Human Action Recognition Using Spatial Pyramid Pooling and 3D Densely Convolutional Networks [O]. Wanli Yang, Yimin Chen, Chen Huang. 2018
