
Scene Flow to Action Map: A New Representation for RGB-D Based Action Recognition with Convolutional Neural Networks

Abstract

Scene flow describes the motion of 3D objects in the real world and could potentially serve as the basis of a good feature for 3D action recognition. However, its use for action recognition, especially in the context of convolutional neural networks (ConvNets), has not been studied before. In this paper, we propose the extraction and use of scene flow for action recognition from RGB-D data. Previous works have considered the depth and RGB modalities as separate channels and extracted features from each for later fusion. We take a different approach and consider the two modalities as one entity, allowing features for action recognition to be extracted jointly from the start. Two key questions about the use of scene flow for action recognition are addressed: how to organize the scene flow vectors, and how to represent the long-term dynamics of videos based on scene flow. In order to compute scene flow correctly on the available datasets, we propose an effective self-calibration method that spatially aligns the RGB and depth data without knowledge of the camera parameters. Based on the scene flow vectors, we propose a new representation, namely Scene Flow to Action Map (SFAM), which describes several forms of long-term spatio-temporal dynamics for action recognition. We adopt a channel transform kernel that maps the scene flow vectors to an optimal color space analogous to RGB, which takes better advantage of ConvNet models pre-trained on ImageNet. Experimental results indicate that this new representation surpasses the performance of state-of-the-art methods on two large public datasets.
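To make the pipeline concrete, below is a minimal Python sketch of the SFAM idea as described in the abstract. Everything in it is an illustrative assumption rather than the paper's actual implementation: the function names are hypothetical, the 3x3 channel-transform kernel is a placeholder for the optimized one the paper learns, plain temporal summation stands in for the paper's long-term aggregation variants, and the scene-flow fields themselves are random stand-ins for values estimated from aligned RGB-D frames.

```python
import numpy as np

def channel_transform(flow_map, kernel):
    """Map per-pixel scene-flow vectors (dx, dy, dz) into an RGB-like
    color space with a 3x3 transform kernel, then rescale to [0, 255]
    so the result can be fed to an ImageNet-pretrained ConvNet."""
    h, w, _ = flow_map.shape
    mapped = (flow_map.reshape(-1, 3) @ kernel.T).reshape(h, w, 3)
    lo, hi = mapped.min(), mapped.max()
    return ((mapped - lo) / (hi - lo + 1e-8) * 255.0).astype(np.uint8)

def scene_flow_to_action_map(flows, kernel):
    """Collapse a sequence of per-frame scene-flow fields into a single
    action-map image. Temporal summation is used here purely for
    illustration; the paper proposes several long-term aggregation
    schemes to capture the spatio-temporal dynamics."""
    accumulated = np.zeros_like(flows[0], dtype=np.float64)
    for flow in flows:
        accumulated += flow
    return channel_transform(accumulated, kernel)

# Toy usage: 10 frames of random 64x48 "scene flow" and an identity
# kernel standing in for the learned channel transform.
flows = [np.random.randn(48, 64, 3) for _ in range(10)]
sfam = scene_flow_to_action_map(flows, np.eye(3))
print(sfam.shape, sfam.dtype)  # (48, 64, 3) uint8 -- a ConvNet-ready image
```

The design intuition, per the abstract, is that encoding motion as an ordinary color image lets standard ConvNets pre-trained on ImageNet be reused without architectural changes.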