Conference: Chinese Automation Congress

Action Video Recognition Framework based on NetVLAD with Data Augmentation

Abstract

In this paper, we propose an end-to-end deep learning framework that extracts global spatio-temporal features from videos, based on the NetVLAD framework optimized with data augmentation. Given keyframes extracted from the original videos, the framework recognizes actions in three steps. The first step augments the data with central cropping, random cropping, and keyframe scaling. The second step computes a local feature descriptor for each frame with the Two-Stream Inflated 3D ConvNet (I3D), which is based on 2D ConvNet inflation and enables the network to capture spatial and temporal features simultaneously. The third step aggregates global features for action recognition through the new generalized “Vector of Locally Aggregated Descriptors” (NetVLAD) layer, optimized with a novel pooling strategy, which avoids the misjudgment caused by local features. The whole framework is trained and fine-tuned end to end. We demonstrate that our framework outperforms state-of-the-art algorithms on the UCF101 dataset. The competitive results show efficient action recognition with high accuracy (up to 91.25%) in a short time (close to 1.8 s), which should significantly improve the performance and applicability of video action recognition.
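
The first and third steps can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names, crop sizes, and array shapes are assumptions made for illustration, the I3D feature extractor is replaced by random descriptors, and the aggregation follows the standard NetVLAD formulation (soft assignment, residual sum, normalization) rather than the novel pooling strategy, which the abstract does not specify.

```python
import numpy as np

def center_crop(frame, size):
    """Cut a size x size window from the middle of an (H, W, C) frame."""
    h, w = frame.shape[:2]
    top, left = (h - size) // 2, (w - size) // 2
    return frame[top:top + size, left:left + size]

def random_crop(frame, size, rng):
    """Cut a size x size window at a uniformly random position."""
    h, w = frame.shape[:2]
    top = rng.integers(0, h - size + 1)
    left = rng.integers(0, w - size + 1)
    return frame[top:top + size, left:left + size]

def scale_nearest(frame, out_h, out_w):
    """Resize a frame with nearest-neighbour sampling (hypothetical helper)."""
    h, w = frame.shape[:2]
    rows = np.arange(out_h) * h // out_h
    cols = np.arange(out_w) * w // out_w
    return frame[rows[:, None], cols]

def netvlad_aggregate(descriptors, centers, assign_w, assign_b):
    """Standard NetVLAD pooling of N local descriptors into one global vector.

    descriptors: (N, D) local features (stand-ins for per-frame I3D outputs)
    centers:     (K, D) cluster centres
    assign_w:    (D, K) and assign_b: (K,) soft-assignment parameters
    Returns an L2-normalized vector of length K * D.
    """
    logits = descriptors @ assign_w + assign_b           # (N, K)
    logits -= logits.max(axis=1, keepdims=True)          # numerical stability
    assign = np.exp(logits)
    assign /= assign.sum(axis=1, keepdims=True)          # softmax over clusters
    # V[k] = sum_i assign[i, k] * (descriptors[i] - centers[k])
    vlad = assign.T @ descriptors - assign.sum(axis=0)[:, None] * centers
    vlad /= np.linalg.norm(vlad, axis=1, keepdims=True) + 1e-12  # intra-normalization
    vlad = vlad.ravel()
    return vlad / (np.linalg.norm(vlad) + 1e-12)         # final L2 normalization

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    keyframe = rng.random((240, 320, 3))                 # assumed frame size
    views = [center_crop(keyframe, 224),
             random_crop(keyframe, 224, rng),
             center_crop(scale_nearest(keyframe, 256, 340), 224)]
    local = rng.normal(size=(len(views), 16))            # placeholder descriptors
    video_vector = netvlad_aggregate(local, rng.normal(size=(4, 16)),
                                     rng.normal(size=(16, 4)), np.zeros(4))
    print(video_vector.shape)
```

In a trained network, the cluster centres and assignment parameters would be learned jointly with the feature extractor; here they are random values used only to show the shapes involved.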