IEEE Transactions on Multimedia

Deep0Tag: Deep Multiple Instance Learning for Zero-Shot Image Tagging


Abstract

Zero-shot learning aims to perform visual reasoning about unseen objects. In line with the success of deep learning on object recognition problems, several end-to-end deep models for zero-shot recognition have been proposed in the literature. These models succeed at predicting a single unseen label for an input image but do not scale to cases where multiple unseen objects are present. Here, we focus on the challenging problem of zero-shot image tagging, where an image is assigned multiple labels that may relate to objects, attributes, actions, events, and scene type. Discovering these scene concepts requires the ability to process multi-scale information. To encompass global as well as local image details, we propose an automatic approach that locates relevant image patches and models image tagging within the Multiple Instance Learning (MIL) framework. To the best of our knowledge, this is the first end-to-end trainable deep MIL framework for the multi-label zero-shot tagging problem. We explore several alternatives for instance-level evidence aggregation and perform an extensive ablation study to identify the optimal pooling strategy. Due to its novel design, the proposed framework has several interesting features: 1) unlike previous deep MIL models, it does not use any offline procedure (e.g., Selective Search or EdgeBoxes) for bag generation; 2) at test time, it can process any number of unseen labels given their semantic embedding vectors; 3) using only image-level seen labels as weak annotation, it can produce a localized bounding box for each predicted label. We experiment with the large-scale NUS-WIDE and MS-COCO datasets and achieve superior performance across conventional, zero-shot, and generalized zero-shot tagging tasks.
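
To make the MIL aggregation step concrete, the following is a minimal PyTorch sketch of the idea the abstract describes: each image patch (instance) is projected into the semantic embedding space, scored against label word vectors, and the instance-level scores are pooled into a bag-level tag score. The class name, dimensions, and pooling choices here are illustrative assumptions, not the authors' implementation.

```python
# A minimal sketch, assuming CNN patch features and word-vector label
# embeddings; all names and sizes below are hypothetical.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ZeroShotMILTagger(nn.Module):
    """Scores a bag of image patches against semantic label embeddings."""

    def __init__(self, feat_dim=2048, embed_dim=300, pooling="max"):
        super().__init__()
        self.project = nn.Linear(feat_dim, embed_dim)  # visual -> semantic space
        self.pooling = pooling

    def forward(self, patch_feats, label_embeds):
        # patch_feats:  (P, feat_dim)  features of P candidate image patches
        # label_embeds: (L, embed_dim) word vectors; unseen labels can be
        #               supplied here at test time without retraining.
        inst = self.project(patch_feats)          # (P, embed_dim)
        scores = inst @ label_embeds.t()          # (P, L) instance-level evidence
        if self.pooling == "max":
            bag, best_patch = scores.max(dim=0)   # strongest patch per label
        elif self.pooling == "mean":
            bag, best_patch = scores.mean(dim=0), None
        else:  # log-sum-exp: a smooth alternative between max and mean
            bag, best_patch = torch.logsumexp(scores, dim=0), None
        return bag, best_patch

# Training uses only image-level seen labels as weak supervision.
model = ZeroShotMILTagger()
patches = torch.randn(36, 2048)   # hypothetical patch features for one image
seen = torch.randn(81, 300)       # hypothetical seen-label embeddings
bag_scores, loc = model(patches, seen)
loss = F.binary_cross_entropy_with_logits(bag_scores, torch.zeros(81))
```

Because labels enter only through their embedding vectors, any number of unseen labels can be scored at test time by passing their word vectors, and under max pooling the index of the winning patch gives a weakly supervised localization for each predicted tag.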
