
I3D-LSTM: A New Model for Human Action Recognition


Abstract

Action recognition, which attempts to classify different human actions in videos, has recently become a popular research topic. Current mainstream methods generally use an ImageNet-pretrained model as the feature extractor; however, pretraining on a large still-image dataset is not the optimal choice for a model that classifies videos. Moreover, very few works note that a 3D convolutional neural network (3D CNN) is better suited to extracting low-level spatio-temporal features, while a recurrent neural network (RNN) is better suited to modelling high-level temporal feature sequences. Consequently, we propose a novel model to address these two problems. First, we pretrain a 3D CNN on the large video action recognition dataset Kinetics to improve the generality of the model. Then a long short-term memory (LSTM) network is introduced to model the high-level temporal features produced by the Kinetics-pretrained 3D CNN. Our experimental results show that the Kinetics-pretrained model generally outperforms the ImageNet-pretrained model, and our proposed network achieves leading performance on the UCF-101 dataset.
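The abstract describes a two-stage architecture: a Kinetics-pretrained 3D CNN extracts low-level spatio-temporal features from short video snippets, and an LSTM models the resulting high-level feature sequence before classification. The sketch below illustrates this idea only; it is not the authors' implementation. As assumptions not stated in the paper, it uses torchvision's Kinetics-400-pretrained r3d_18 as a stand-in for I3D, the snippet length and layer sizes are illustrative, and the model name CNN3DLSTM is hypothetical.

# Minimal sketch of a 3D CNN + LSTM action classifier (assumptions noted above).
import torch
import torch.nn as nn
from torchvision.models.video import r3d_18, R3D_18_Weights

class CNN3DLSTM(nn.Module):
    def __init__(self, num_classes=101, hidden_size=512):
        super().__init__()
        # Kinetics-pretrained 3D CNN backbone (r3d_18 used here as a stand-in for I3D).
        backbone = r3d_18(weights=R3D_18_Weights.KINETICS400_V1)
        self.feature_dim = backbone.fc.in_features   # 512 for r3d_18
        backbone.fc = nn.Identity()                   # keep pooled per-snippet features
        self.backbone = backbone
        # LSTM models the sequence of high-level snippet features.
        self.lstm = nn.LSTM(self.feature_dim, hidden_size, batch_first=True)
        self.classifier = nn.Linear(hidden_size, num_classes)

    def forward(self, snippets):
        # snippets: (batch, num_snippets, 3, frames, height, width)
        b, s = snippets.shape[:2]
        x = snippets.flatten(0, 1)                    # fold snippets into the batch dim
        feats = self.backbone(x).view(b, s, self.feature_dim)
        out, _ = self.lstm(feats)                     # temporal modelling over snippets
        return self.classifier(out[:, -1])            # classify from the last time step

if __name__ == "__main__":
    model = CNN3DLSTM(num_classes=101)
    clips = torch.randn(2, 4, 3, 16, 112, 112)        # 2 videos, 4 snippets of 16 frames
    print(model(clips).shape)                          # torch.Size([2, 101])

In this sketch each video is split into several snippets; the 3D CNN summarizes each snippet independently, and the LSTM aggregates the snippet features over time, mirroring the division of labour argued for in the abstract.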
