Journal: IEEE Transactions on Neural Networks and Learning Systems

3-D PersonVLAD: Learning Deep Global Representations for Video-Based Person Reidentification


Abstract

We present a global deep video representation learning approach for video-based person reidentification (re-ID) that aggregates local 3-D features across the entire extent of a video. Existing methods typically extract frame-wise deep features with 2-D convolutional networks (ConvNets) and pool them temporally to produce video-level representations. However, 2-D ConvNets lose temporal priors immediately after the convolutions, and a separate temporal pooling step is limited in how well it captures human motion in short sequences. In this paper, we introduce global video representation learning as a novel layer, complementary to 3-D ConvNets, that captures appearance and motion dynamics in full-length videos. Nevertheless, encoding each video frame in its entirety and computing aggregate global representations across all frames is highly challenging because of occlusions and misalignments. To resolve this, the proposed network is further augmented with 3-D part alignment, which learns local features through a soft-attention module. These attended features are statistically aggregated to yield identity-discriminative representations. Our global 3-D features achieve state-of-the-art results on three benchmark data sets: MARS, iLIDS-VID (Imagery Library for Intelligent Detection Systems-Video Re-identification), and PRID2011.
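The abstract does not spell out how the attended features are "statistically aggregated". As a rough illustration only, the sketch below shows a generic VLAD-style aggregation with soft assignment, which the name PersonVLAD suggests but the abstract does not confirm. The function name `vlad_aggregate`, the `attention` weights, the `alpha` sharpness parameter, and the fixed cluster centers are all assumptions made for this example, not the paper's layer. Each local feature (for instance, one spatio-temporal position of a 3-D ConvNet feature map) is softly assigned to the cluster centers, its attention-weighted residuals are summed per cluster, and the result is normalized into one video-level vector.

```python
import math

def vlad_aggregate(features, centers, attention=None, alpha=10.0):
    """Aggregate N local D-dim features into one K*D vector (hypothetical helper).

    features:  list of N feature vectors (e.g. flattened 3-D ConvNet positions)
    centers:   list of K cluster centers (assumed given; learned in a real layer)
    attention: optional list of N non-negative weights (assumed soft-attention scores)
    alpha:     sharpness of the soft assignment (assumption)
    """
    k, d = len(centers), len(centers[0])
    if attention is None:
        attention = [1.0] * len(features)

    vlad = [[0.0] * d for _ in range(k)]
    for x, a in zip(features, attention):
        # Squared distance from this feature to every center.
        sq = [sum((xi - ci) ** 2 for xi, ci in zip(x, c)) for c in centers]
        # Numerically stable softmax over negative scaled distances.
        m = min(sq)
        w = [math.exp(-alpha * (s - m)) for s in sq]
        z = sum(w)
        # Accumulate attention-weighted residuals per cluster.
        for j, c in enumerate(centers):
            coef = a * w[j] / z
            for t in range(d):
                vlad[j][t] += coef * (x[t] - c[t])

    # Intra-normalization: L2-normalize each cluster's residual sum.
    flat = []
    for row in vlad:
        norm = math.sqrt(sum(v * v for v in row))
        flat.extend(v / norm if norm > 0 else 0.0 for v in row)

    # Final L2 normalization of the concatenated descriptor.
    total = math.sqrt(sum(v * v for v in flat))
    return [v / total if total > 0 else 0.0 for v in flat]


# Toy usage: four 2-D local features, two centers.
feats = [[0.1, 0.0], [0.9, 1.2], [1.2, 1.1], [0.0, 0.3]]
cents = [[0.0, 0.0], [1.0, 1.0]]
descriptor = vlad_aggregate(feats, cents)
print(len(descriptor))  # K * D entries
```

Because the descriptor length depends only on the number of centers and the feature dimension, videos of different lengths map to vectors of the same size, which is what makes this kind of aggregation usable as a video-level representation.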
