IEEE Transactions on Image Processing

SCAN: Self-and-Collaborative Attention Network for Video Person Re-Identification


Abstract

Video person re-identification has attracted much attention in recent years. It aims to match image sequences of pedestrians across different camera views. Previous approaches usually improve this task from three aspects: 1) selecting more discriminative frames; 2) generating more informative temporal representations; and 3) developing more effective distance metrics. To address these issues, we present a novel and practical deep architecture for video person re-identification, termed the Self-and-Collaborative Attention Network (SCAN), which takes video pairs as input and outputs their matching scores. SCAN has several appealing properties. First, SCAN adopts a non-parametric attention mechanism to refine the intra-sequence and inter-sequence feature representations of videos and outputs a self-and-collaborative feature representation for each video, aligning the discriminative frames between the probe and gallery sequences. Second, going beyond existing models, a generalized pairwise similarity measurement is proposed to generate the similarity feature representation of a video pair by calculating the Hadamard product of its self-representation difference and collaborative-representation difference; the matching result can then be predicted by a binary classifier. Third, a dense clip segmentation strategy is introduced to generate rich probe-gallery pairs for optimizing the model. In the test phase, the final matching score of two videos is determined by averaging the scores of the top-ranked clip pairs. Extensive experiments demonstrate the effectiveness of SCAN, which outperforms the best-performing baselines in top-1 accuracy on the iLIDS-VID, PRID2011, and MARS datasets.
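
The pairwise similarity measurement described in the abstract can be illustrated with a minimal sketch, assuming PyTorch and per-clip feature vectors. Only the Hadamard-product construction, the binary classifier, and the test-time averaging of top-ranked clip pairs come from the abstract; the class name SimilarityHead, the feature dimension, the classifier layers, and the exact pairing of the two differences are illustrative assumptions, not the paper's implementation.

    # A hypothetical sketch of SCAN's similarity head, not the authors' code.
    import torch
    import torch.nn as nn

    class SimilarityHead(nn.Module):
        def __init__(self, feat_dim: int = 2048):
            super().__init__()
            # Binary classifier over the similarity feature: match vs. non-match.
            # Layer sizes are assumptions for illustration.
            self.classifier = nn.Sequential(
                nn.Linear(feat_dim, 256),
                nn.ReLU(inplace=True),
                nn.Linear(256, 1),
            )

        def forward(self, probe_self, gallery_self, probe_collab, gallery_collab):
            # Differences between the pair's self- and collaborative-representations.
            d_self = probe_self - gallery_self
            d_collab = probe_collab - gallery_collab
            # Hadamard (element-wise) product forms the similarity feature.
            sim_feat = d_self * d_collab
            # Matching score in (0, 1); higher means more likely the same identity.
            return torch.sigmoid(self.classifier(sim_feat)).squeeze(-1)

    head = SimilarityHead(feat_dim=2048)
    # Eight hypothetical probe-gallery clip pairs from dense clip segmentation.
    ps, gs, pc, gc = (torch.randn(8, 2048) for _ in range(4))
    scores = head(ps, gs, pc, gc)  # shape (8,), one score per clip pair
    # Test phase, per the abstract: average the scores of the top-ranked clip pairs.
    final_score = scores.topk(k=5).values.mean()

The value of k for the top-ranked clip pairs is not given in the abstract; 5 here is a placeholder.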