Asian Conference on Computer Vision

Video-Based Person Re-identification via 3D Convolutional Networks and Non-local Attention

Abstract

Video-based person re-identification (ReID) is a challenging problem in which video tracks of people captured by non-overlapping cameras must be matched. Feature aggregation from a video track is a key step in video-based person ReID. Many existing methods tackle this problem with average/maximum temporal pooling or RNNs with attention. However, these methods cannot handle temporal dependencies and spatial misalignment at the same time. We draw inspiration from video action recognition, which involves identifying different actions from video tracks. First, we apply 3D convolutions to the video volume, instead of 2D convolutions across frames, to extract spatial and temporal features simultaneously. Second, we use a non-local block to tackle the misalignment problem and capture spatial-temporal long-range dependencies. As a result, the network can learn useful spatial-temporal information as a weighted sum of the features at all spatial and temporal positions in the input feature map. Experimental results on three datasets show that our framework outperforms state-of-the-art approaches by a large margin on multiple metrics.
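The two building blocks the abstract describes can be sketched in a few lines of PyTorch. The sketch below is an assumption, not the authors' released code: a single 3D convolution stands in for the full backbone, the non-local block follows the embedded-Gaussian formulation of Wang et al. (2018) that such blocks are based on, and all names (Toy3DReIDNet, NonLocalBlock3D) and layer sizes are illustrative.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class NonLocalBlock3D(nn.Module):
    """Embedded-Gaussian non-local block over a (N, C, T, H, W) feature map.

    The output at each space-time position is a weighted sum of the
    features at all positions, added back to the input as a residual.
    """

    def __init__(self, channels: int):
        super().__init__()
        inter = channels // 2  # bottleneck width, a common choice
        self.theta = nn.Conv3d(channels, inter, kernel_size=1)
        self.phi = nn.Conv3d(channels, inter, kernel_size=1)
        self.g = nn.Conv3d(channels, inter, kernel_size=1)
        self.out = nn.Conv3d(inter, channels, kernel_size=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        n, _, t, h, w = x.shape
        # Flatten all T*H*W positions so every position attends to every other.
        theta = self.theta(x).flatten(2).transpose(1, 2)  # (N, THW, C')
        phi = self.phi(x).flatten(2)                      # (N, C', THW)
        g = self.g(x).flatten(2).transpose(1, 2)          # (N, THW, C')
        attn = F.softmax(theta @ phi, dim=-1)             # pairwise affinities
        y = (attn @ g).transpose(1, 2).reshape(n, -1, t, h, w)
        return x + self.out(y)                            # residual connection


class Toy3DReIDNet(nn.Module):
    """A 3D convolution over the whole video volume, then a non-local block."""

    def __init__(self, feat_dim: int = 64):
        super().__init__()
        # One 3D conv extracts spatial and temporal features jointly,
        # unlike 2D convs applied frame by frame.
        self.conv = nn.Conv3d(3, feat_dim, kernel_size=3,
                              stride=(1, 2, 2), padding=1)
        self.attn = NonLocalBlock3D(feat_dim)

    def forward(self, clip: torch.Tensor) -> torch.Tensor:
        feat = F.relu(self.conv(clip))   # (N, C, T, H/2, W/2)
        feat = self.attn(feat)           # long-range space-time dependencies
        return feat.mean(dim=(2, 3, 4))  # pool to one descriptor per track


if __name__ == "__main__":
    clip = torch.randn(2, 3, 4, 32, 16)  # two tiny RGB tracklets
    print(Toy3DReIDNet()(clip).shape)    # torch.Size([2, 64])
```

The residual form `x + self.out(y)` lets the block be inserted into an existing backbone without disturbing its initial behavior; in non-local networks the final 1x1x1 convolution is often zero-initialized for exactly this reason.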
