IEEE Winter Conference on Applications of Computer Vision

Unsupervised Adversarial Visual Level Domain Adaptation for Learning Video Object Detectors From Images

Abstract

Deep learning based object detectors require thousands of diversified bounding-box and class-annotated examples. Though image object detectors have progressed rapidly in recent years with the release of multiple large-scale static image datasets, object detection on videos remains an open problem due to the scarcity of annotated video frames. A robust video object detector is an essential component for video understanding and for curating large-scale automated annotations in videos. The domain difference between images and videos makes the transfer of image object detectors to videos sub-optimal. The most common solution is to use weakly supervised annotations, where each video frame must be tagged for the presence or absence of object categories; this still requires manual effort. In this paper we take a step forward by adapting unsupervised adversarial image-to-image translation to perturb static high-quality images so that they become visually indistinguishable from a set of video frames. We assume the presence of a fully annotated static image dataset and an unannotated video dataset. The object detector is then trained on the adversarially transformed image dataset using the annotations of the original dataset. Experiments on the Youtube-Objects and Youtube-Objects-Subset datasets with two contemporary baseline object detectors reveal that such unsupervised pixel-level domain adaptation boosts generalization performance on video frames compared with direct application of the original image object detector. We also achieve performance competitive with recent weakly supervised baselines. This paper can be seen as an application of image translation to cross-domain object detection. Code is available at https://github.com/avisekiit/wacv_2019.
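The pipeline the abstract describes has two stages: an unsupervised adversarial translator that maps annotated static images into the visual style of video frames, and a standard detector trained on the translated images with the original bounding boxes. Below is a minimal PyTorch sketch of the translation stage in the CycleGAN family of methods; the tiny Generator/Discriminator networks, the loss weights, and the placeholder tensors are illustrative assumptions rather than the paper's actual architecture (the linked repository has the real code).

```python
import torch
import torch.nn as nn

def conv_block(c_in, c_out):
    return nn.Sequential(
        nn.Conv2d(c_in, c_out, 3, padding=1),
        nn.InstanceNorm2d(c_out),
        nn.ReLU(inplace=True),
    )

class Generator(nn.Module):
    """Toy image-to-image translator (stand-in for a full CycleGAN generator)."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            conv_block(3, 64),
            conv_block(64, 64),
            nn.Conv2d(64, 3, 3, padding=1),
            nn.Tanh(),
        )
    def forward(self, x):
        return self.net(x)

class Discriminator(nn.Module):
    """Toy patch critic: does this frame look like real video data?"""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            conv_block(3, 64),
            conv_block(64, 64),
            nn.Conv2d(64, 1, 3, padding=1),
        )
    def forward(self, x):
        return self.net(x)

G_i2v = Generator()        # static image -> video-frame style
G_v2i = Generator()        # video frame  -> image style (for cycle consistency)
D_vid = Discriminator()    # real video frame vs. translated image

opt_g = torch.optim.Adam(
    list(G_i2v.parameters()) + list(G_v2i.parameters()), lr=2e-4)
opt_d = torch.optim.Adam(D_vid.parameters(), lr=2e-4)
gan_loss, cyc_loss = nn.MSELoss(), nn.L1Loss()

# Placeholder batches; in practice these come from the annotated image
# dataset and the unannotated video frames.
images = torch.randn(4, 3, 128, 128)
frames = torch.randn(4, 3, 128, 128)

# Discriminator step: real video frames vs. translated (fake) frames.
fake_frames = G_i2v(images)
d_real, d_fake = D_vid(frames), D_vid(fake_frames.detach())
loss_d = (gan_loss(d_real, torch.ones_like(d_real)) +
          gan_loss(d_fake, torch.zeros_like(d_fake)))
opt_d.zero_grad(); loss_d.backward(); opt_d.step()

# Generator step: fool the discriminator and stay cycle-consistent
# (one direction of the cycle shown; a full CycleGAN trains both).
fake_frames = G_i2v(images)
d_fake = D_vid(fake_frames)
loss_g = (gan_loss(d_fake, torch.ones_like(d_fake)) +
          10.0 * cyc_loss(G_v2i(fake_frames), images))
opt_g.zero_grad(); loss_g.backward(); opt_g.step()

# `fake_frames` now look video-like but keep the source geometry, so the
# original bounding-box annotations remain valid for detector training.
```

Because the translation is pixel-level and geometry-preserving, the translated images reuse the source annotations unchanged, which is what allows the detector to be trained without any video labels.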