International Joint Conference on Neural Networks (IJCNN)

Dual-Triplet Metric Learning for Unsupervised Domain Adaptation in Video Face Recognition



Abstract

The scalability and complexity of deep learning models remain key issues in many visual recognition applications. For instance, in video surveillance, fine-tuning a model with labeled image data from each new camera is required to reduce the domain shift between videos captured in the source domain (laboratory setting) and the target domain (operational environment). In many video surveillance applications, such as face recognition and person re-identification, a pair-wise matcher is typically employed to assign a query image captured with a video camera to the corresponding reference images in a gallery. The different configuration, viewpoint, and operational conditions of each camera can introduce significant shifts in pair-wise distance distributions, resulting in a decline in recognition performance for new cameras. In this paper, a new deep domain adaptation (DA) method is proposed to adapt the CNN embedding of a Siamese network using unlabeled tracklets captured with a new video camera. To this end, a dual-triplet loss is introduced for metric learning, where two triplets are constructed using video data from a source camera and a new target camera. To form the dual triplets, a mutual-supervised learning approach is introduced in which the source camera acts as a teacher, providing the target camera with an initial embedding. The student then relies on the teacher to iteratively label the positive and negative pairs collected during, e.g., initial camera calibration. Both source and target embeddings continue to learn simultaneously so that their pair-wise distance distributions become aligned. For validation, the proposed metric learning technique is used to train deep Siamese networks under different training scenarios, and is compared against state-of-the-art techniques for still-to-video face recognition (FR) on the COX-S2V and a private video-based FR dataset. Results indicate that the proposed method can provide a level of accuracy comparable to the upper-bound performance of the training scenario in which labeled target data is employed to fine-tune the Siamese network.
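To make the training signal concrete, the following is a minimal PyTorch sketch of a dual-triplet objective in the spirit of the abstract: one triplet is drawn from labeled source-camera video, and one from target-camera tracklets whose positive/negative pairs are pseudo-labeled by the teacher embedding. The function names, the margin value, the equal weighting of the two terms, and the distance-threshold labeling rule are illustrative assumptions, not the paper's exact formulation.

```python
import torch
import torch.nn.functional as F

def triplet_loss(anchor, positive, negative, margin=0.2):
    # Hinge loss on Euclidean embedding distances: pull the positive
    # closer to the anchor than the negative by at least `margin`.
    d_pos = F.pairwise_distance(anchor, positive)
    d_neg = F.pairwise_distance(anchor, negative)
    return F.relu(d_pos - d_neg + margin).mean()

def dual_triplet_loss(src_triplet, tgt_triplet, margin=0.2):
    # Sum of a source-camera triplet (ground-truth labels) and a
    # target-camera triplet (teacher pseudo-labels), so that both
    # embeddings keep learning and their pair-wise distance
    # distributions stay aligned. (Illustrative: the paper may weight
    # or schedule the two terms differently.)
    loss_src = triplet_loss(*src_triplet, margin=margin)
    loss_tgt = triplet_loss(*tgt_triplet, margin=margin)
    return loss_src + loss_tgt

@torch.no_grad()
def pseudo_label_pair(teacher, tracklet_a, tracklet_b, threshold=0.5):
    # Hypothetical teacher labeling rule for this sketch: a target-camera
    # pair is labeled positive when the mean embeddings of the two
    # tracklets (each of shape [frames, dim]) fall within `threshold`.
    e_a = teacher(tracklet_a).mean(dim=0, keepdim=True)
    e_b = teacher(tracklet_b).mean(dim=0, keepdim=True)
    return (F.pairwise_distance(e_a, e_b) < threshold).item()
```

In a training loop built on this sketch, pairs from the unlabeled target camera would first be pseudo-labeled by the teacher via `pseudo_label_pair`, mined into target triplets, and then combined with labeled source triplets in `dual_triplet_loss` at every iteration, consistent with the iterative teacher-student labeling the abstract describes.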
