A Closer Look at Spatiotemporal Convolutions for Action Recognition

Abstract

In this paper we discuss several forms of spatiotemporal convolutions for video analysis and study their effects on action recognition. Our motivation stems from the observation that 2D CNNs applied to individual frames of the video have remained solid performers in action recognition. In this work we empirically demonstrate the accuracy advantages of 3D CNNs over 2D CNNs within the framework of residual learning. Furthermore, we show that factorizing the 3D convolutional filters into separate spatial and temporal components yields significant gains in accuracy. Our empirical study leads to the design of a new spatiotemporal convolutional block, 'R(2+1)D', which produces CNNs that achieve results comparable or superior to the state of the art on Sports-1M, Kinetics, UCF101, and HMDB51.
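The factorization described in the abstract can be illustrated with a simple parameter count. The following is a minimal sketch, not the authors' code: it compares a full t x d x d 3D convolution with a 1 x d x d spatial convolution followed by a t x 1 x 1 temporal convolution. The function names are hypothetical, biases are ignored, and the rule for choosing the intermediate channel count (picked so that the factorized block has roughly as many parameters as the full 3D block) is an assumption that this record does not state.

```python
def conv3d_params(c_in, c_out, t, d):
    """Weights in a full 3D convolution with a t x d x d kernel (no bias)."""
    return c_in * c_out * t * d * d


def matched_mid_channels(c_in, c_out, t, d):
    """Assumed rule: intermediate channel count that keeps the factorized
    block at (or just below) the parameter count of the full 3D block."""
    return (t * d * d * c_in * c_out) // (d * d * c_in + t * c_out)


def conv2plus1d_params(c_in, c_out, t, d, mid=None):
    """Weights in a 1 x d x d spatial conv followed by a t x 1 x 1 temporal conv."""
    if mid is None:
        mid = matched_mid_channels(c_in, c_out, t, d)
    spatial = c_in * mid * d * d   # 2D convolution applied to each frame
    temporal = mid * c_out * t     # 1D convolution across frames
    return spatial + temporal


if __name__ == "__main__":
    # Example sizes chosen for illustration only.
    print(conv3d_params(64, 64, 3, 3))         # 110592
    print(matched_mid_channels(64, 64, 3, 3))  # 144
    print(conv2plus1d_params(64, 64, 3, 3))    # 110592
```

With these illustrative sizes the two blocks have the same number of weights, so any accuracy difference between them would come from the factorized structure (and the extra nonlinearity that can be placed between the two convolutions) rather than from added capacity.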


Bibliographic details

Source: IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2018, pp. 6450-6459 (10 pages)
Conference location: Salt Lake City (US)
Authors: Du Tran; Heng Wang; Lorenzo Torresani; Jamie Ray; Yann LeCun; Manohar Paluri
Original format: PDF
Keywords: Three-dimensional displays; Two dimensional displays; Spatiotemporal phenomena; Solid modeling; Feature extraction; Computer architecture

Record added: 2022-08-26 14:35:28

Similar literature

1. Video spatiotemporal mapping for human action recognition by convolutional neural network [J]. Zare Amin, Abrishami Moghaddam Hamid, Sharifi Arash. Pattern Analysis and Applications. 2020, No. 1
2. Emotion recognition from spatiotemporal EEG representations with hybrid convolutional recurrent neural networks via wearable multi-channel headset [J]. Computer Communications. 2020, No. Mara
3. Traffic Command Gesture Recognition for Virtual Urban Scenes Based on a Spatiotemporal Convolution Neural Network [J]. Chunyong Ma, Yu Zhang, Anni Wang, ISPRS International Journal of Geo-Information. 2018, No. 1
4. A Closer Look at Spatiotemporal Convolutions for Action Recognition [C]. Du Tran, Heng Wang, Lorenzo Torresani, IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2018
5. Improving Facial Action Unit Recognition Using Convolutional Neural Networks [D]. Han, Shizhong. 2017
6. A Spatiotemporal Convolutional Network for Multi-Behavior Recognition of Pigs [O]. Dan Li, Kaifeng Zhang, Zhenbo Li, 2020
7. Action Recognition Based on Two-Stream Convolutional Networks With Long-Short-Term Spatiotemporal Features [O]. Yanqin Wan, Zujun Yu, Yao Wang, 2020
8. Convolutional Architecture Exploration for Action Recognition and Image Classification [R]. Turner, J. T., Aha, D., Smith, L., 2015
