Conference: International Conference on Pattern Recognition and Machine Intelligence

Data Driven Sensing for Action Recognition Using Deep Convolutional Neural Networks

Abstract

Tasks such as action recognition require high-quality features for accurate inference. However, high-resolution, high-volume video data poses a significant challenge for inference in terms of storage and computational complexity. Compressive sensing is a potential solution to these problems, but it has been shown to recover signals at higher compression ratios with a loss of information. Hence, a framework is required that performs good-quality action recognition on compressively sensed data. In this paper, we present data-driven sensing for spatial multiplexers trained with a combined mean square error (MSE) and perceptual loss using deep convolutional neural networks. We employ subpixel convolutional layers in a 2D convolutional encoder-decoder model that learns downscaling filters in the encoder, which bring the input from a higher dimension to a lower one, and the reverse, i.e., upscaling filters, in the decoder. We stack this encoder with an Inflated 3D ConvNet and train the cascaded network with cross-entropy loss for action recognition. After encoding the data and undersampling it by over 100 times (10 × 10) relative to the input size, we obtain 75.05% accuracy on UCF-101 and 50.39% accuracy on HMDB-51, with our proposed architecture setting the baseline for reconstruction-free action recognition with data-driven sensing using deep learning. We experimentally infer that the encoded information from such spatial multiplexers can be used directly for action recognition.
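
The 10 × 10 spatial undersampling mentioned in the abstract can be illustrated with a minimal, dependency-free sketch. It is not the authors' implementation: the function names (`space_to_depth`, `depth_to_space`, `multiplex`), the toy 20 × 20 frame, and the uniform weights are all assumptions made for illustration. In the paper the downscaling filters are learned, whereas here a fixed averaging filter stands in for them. The sketch shows only the block rearrangement that underlies subpixel layers and how one measurement per 10 × 10 block gives a 100-fold reduction in the number of values.

```python
from typing import List

Frame = List[List[float]]
Blocks = List[List[List[float]]]


def space_to_depth(frame: Frame, r: int) -> Blocks:
    """Rearrange an H x W frame into an (H/r) x (W/r) grid of r*r-value blocks."""
    h, w = len(frame), len(frame[0])
    if h % r or w % r:
        raise ValueError("frame height and width must be divisible by r")
    return [
        [
            [frame[bi * r + di][bj * r + dj] for di in range(r) for dj in range(r)]
            for bj in range(w // r)
        ]
        for bi in range(h // r)
    ]


def depth_to_space(blocks: Blocks, r: int) -> Frame:
    """Inverse rearrangement (the 'pixel shuffle' step of a subpixel layer)."""
    hb, wb = len(blocks), len(blocks[0])
    frame = [[0.0] * (wb * r) for _ in range(hb * r)]
    for bi in range(hb):
        for bj in range(wb):
            for k, value in enumerate(blocks[bi][bj]):
                frame[bi * r + k // r][bj * r + k % r] = value
    return frame


def multiplex(frame: Frame, weights: List[float], r: int) -> Frame:
    """Produce one measurement per r x r block as a weighted sum of its pixels."""
    if len(weights) != r * r:
        raise ValueError("expected r*r weights")
    return [
        [sum(wt * px for wt, px in zip(weights, block)) for block in row]
        for row in space_to_depth(frame, r)
    ]


R = 10  # downscaling factor per spatial axis, as in the 10 x 10 setting
frame = [[float(i * 20 + j) for j in range(20)] for i in range(20)]  # toy frame
uniform = [1.0 / (R * R)] * (R * R)  # hypothetical stand-in for learned filters
encoded = multiplex(frame, uniform, R)

# Number of input values divided by number of encoded values.
ratio = (len(frame) * len(frame[0])) // (len(encoded) * len(encoded[0]))
```

With `R = 10`, the 20 × 20 toy frame is reduced to a 2 × 2 grid of measurements, so `ratio` is 100. A learned encoder would replace `uniform` with trained weights, and a recognition network would consume `encoded` directly without calling `depth_to_space`.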
