Fine-grained image classification aims to recognize hundreds of subcategories belonging to the same basic-level category, which is a highly challenging task due to the quite subtle visual distinctions among similar subcategories. Most existing methods learn part detectors to discover discriminative regions for better performance. However, not all localized parts are beneficial and indispensable for classification, and setting the number of part detectors relies heavily on prior knowledge as well as experimental results. When we describe the object in an image in text via natural language, we focus only on the pivotal characteristics and rarely pay attention to common characteristics or the background. This is an involuntary transfer from human visual attention to textual attention, which means that textual attention tells us how many and which parts are discriminative and significant. Therefore, the textual attention of natural language descriptions can help us discover visual attention in the image. Inspired by this, we propose a visual-textual attention driven fine-grained representation learning (VTA) approach, whose main contributions are: (1) Fine-grained visual-textual pattern mining discovers discriminative visual-textual pairwise information to boost classification by jointly modeling vision and text with generative adversarial networks (GANs), which automatically and adaptively discovers discriminative parts. (2) Visual-textual representation learning jointly combines visual and textual information, preserving the intra-modality and inter-modality information to generate a complementary fine-grained representation and further improve classification performance. Experiments on two widely-used datasets demonstrate the effectiveness of our VTA approach, which achieves the best classification accuracy.
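To make the second contribution concrete, below is a minimal, hypothetical sketch of the kind of joint visual-textual fusion the abstract describes: per-modality (intra-modality) classification losses, an inter-modality alignment term, and a fused representation for the final classifier. All module names, feature dimensions, and loss weights are illustrative assumptions and do not reproduce the authors' actual VTA architecture or GAN-based pattern mining.

```python
# Illustrative sketch only: intra-modality + inter-modality objectives over
# visual and textual features, plus a classifier on the fused representation.
import torch
import torch.nn as nn
import torch.nn.functional as F

class JointVisualTextualHead(nn.Module):
    def __init__(self, visual_dim=2048, textual_dim=512, embed_dim=256, num_classes=200):
        super().__init__()
        self.visual_proj = nn.Linear(visual_dim, embed_dim)     # project visual (e.g., CNN part) features
        self.textual_proj = nn.Linear(textual_dim, embed_dim)   # project text-encoder features
        self.visual_cls = nn.Linear(embed_dim, num_classes)     # intra-modality (visual) classifier
        self.textual_cls = nn.Linear(embed_dim, num_classes)    # intra-modality (textual) classifier
        self.joint_cls = nn.Linear(2 * embed_dim, num_classes)  # classifier on the fused representation

    def forward(self, visual_feat, textual_feat, labels):
        v = F.relu(self.visual_proj(visual_feat))
        t = F.relu(self.textual_proj(textual_feat))

        # Intra-modality losses keep each modality discriminative on its own.
        loss_intra = F.cross_entropy(self.visual_cls(v), labels) + \
                     F.cross_entropy(self.textual_cls(t), labels)

        # Inter-modality loss pulls paired visual/textual embeddings together.
        loss_inter = 1.0 - F.cosine_similarity(v, t, dim=1).mean()

        # Complementary fused representation for the final prediction.
        joint_logits = self.joint_cls(torch.cat([v, t], dim=1))
        loss_joint = F.cross_entropy(joint_logits, labels)

        return loss_intra + loss_inter + loss_joint, joint_logits

# Example usage with random tensors (batch of 8, 200 subcategories).
if __name__ == "__main__":
    head = JointVisualTextualHead()
    v = torch.randn(8, 2048)
    t = torch.randn(8, 512)
    y = torch.randint(0, 200, (8,))
    loss, logits = head(v, t, y)
    print(loss.item(), logits.shape)
```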