IEEE Conference on Computer Vision and Pattern Recognition Workshops

Deep Spatial-Temporal Fusion Network for Video-Based Person Re-identification

Abstract

In this paper, we propose a novel deep end-to-end network that automatically learns spatial-temporal fusion features for video-based person re-identification. Specifically, the proposed network combines a CNN and an RNN to jointly learn the spatial and the temporal features of input image sequences. The network is optimized with a siamese loss and a softmax loss simultaneously, pulling instances of the same person closer while pushing instances of different persons apart. Our network is trained on full-body and part-body image sequences respectively, so as to learn complementary representations from holistic and local perspectives. Combining them yields more discriminative features that benefit person re-identification. Experiments conducted on the PRID-2011, iLIDS-VID and MARS datasets show that the proposed method performs favorably against existing approaches.
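The pipeline the abstract describes — per-frame spatial features from a CNN, temporal aggregation by an RNN into a sequence-level representation, and a joint siamese-plus-softmax objective — can be sketched roughly as follows. This is a minimal NumPy sketch, not the paper's implementation: the linear-plus-ReLU "CNN", the vanilla RNN cell, all dimensions, and the random weights are hypothetical stand-ins for the actual learned architecture.

```python
import numpy as np

rng = np.random.default_rng(0)
D_IN, D_F, H, C, T = 32, 16, 8, 5, 6   # hypothetical frame/feature/class/sequence sizes

# Random stand-in weights; the real network learns these end to end.
W_cnn = rng.standard_normal((D_IN, D_F)) * 0.1
W_x   = rng.standard_normal((D_F, H)) * 0.1
W_h   = rng.standard_normal((H, H)) * 0.1
W_cls = rng.standard_normal((H, C)) * 0.1

def cnn_features(frames):
    """Stand-in per-frame 'CNN': a linear map plus ReLU."""
    return np.maximum(frames @ W_cnn, 0.0)

def rnn_aggregate(feats):
    """Vanilla RNN over the frame features; the final hidden state
    serves as the sequence-level spatial-temporal representation."""
    h = np.zeros(H)
    for x in feats:
        h = np.tanh(x @ W_x + h @ W_h)
    return h

def siamese_loss(fa, fb, same, margin=2.0):
    """Contrastive (siamese) loss: pull same-person sequences together,
    push different persons at least `margin` apart."""
    d = np.linalg.norm(fa - fb)
    return 0.5 * d**2 if same else 0.5 * max(0.0, margin - d)**2

def softmax_loss(feat, label):
    """Cross-entropy over identity classes."""
    logits = feat @ W_cls
    z = logits - logits.max()
    p = np.exp(z) / np.exp(z).sum()
    return -np.log(p[label])

# Two image sequences of the same (hypothetical) identity 3.
seq_a = rng.standard_normal((T, D_IN))
seq_b = rng.standard_normal((T, D_IN))
fa = rnn_aggregate(cnn_features(seq_a))
fb = rnn_aggregate(cnn_features(seq_b))

# Joint objective, as in the abstract: siamese + softmax losses.
total = siamese_loss(fa, fb, same=True) + softmax_loss(fa, 3) + softmax_loss(fb, 3)
```

In the paper this objective is applied to two such streams — one trained on full-body sequences and one on part-body sequences — whose representations are then combined at test time.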
