
Multi-level dictionary learning for fine-grained images categorization with attention model



Abstract

Fine-grained image categorization is a challenging task because the discriminative regions that separate sub-categories are difficult to localize. Previous works mainly rely on manual annotations or attention algorithms to localize these regions, both of which are demanding and complex in practical applications. This paper proposes a multi-level attention model (MLA-CNN), trained on the full-size training images of the current task, to localize the most discriminative regions. Intuitively, three typical receptive field sizes are selected for the multi-level attention maps. Multi-level dictionary learning is then introduced to extract discriminative features from the localized regions. Our method explores a new idea of how to use neural activations to generate multi-scale regions that are helpful for fine-grained categorization. The method proceeds in two steps. The first step is to select the neurons that have the maximum activation in the three selected feature maps. These feature maps are the outputs of the pre-trained CNN model when the full-size images are fed into it. The discriminative regions are then generated according to the receptive field size of the selected neurons. The second step is to train the subtle networks with these multi-scale regions. Each scaled discriminative region can be regarded as one typical dictionary feature. The results are then integrated for the final prediction. We evaluate our method on three challenging fine-grained image datasets: CUB-200-2011, Stanford Dogs, and Stanford Cars. The experimental results demonstrate that our method outperforms many state-of-the-art methods, including those that use extra object/part annotations and attention-based methods.
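
The first step described in the abstract (pick the most strongly activated neuron in a feature map, then crop the input region covered by its receptive field) might look roughly like the sketch below. This is only an illustration under stated assumptions, not the authors' implementation: the function name `max_activation_region` and the parameters `rf_size`, `stride`, and `offset` are hypothetical, and the abstract does not say which layers or receptive field sizes are actually used.

```python
import numpy as np


def max_activation_region(feature_map, rf_size, stride, offset, image_hw):
    """Return (top, left, bottom, right) of the input patch seen by the
    most strongly activated neuron of one feature map.

    All parameters are assumptions made for this sketch:
      feature_map: array of shape (channels, height, width)
      rf_size:     receptive field side length in input pixels
      stride:      cumulative stride of the layer w.r.t. the input
      offset:      input coordinate of the receptive-field centre
                   of the neuron at spatial position (0, 0)
      image_hw:    (height, width) of the full-size input image
    """
    # Locate the single neuron with the maximum activation.
    _, row, col = np.unravel_index(np.argmax(feature_map), feature_map.shape)

    # Map the neuron's grid position back to input-image coordinates.
    center_y = offset + row * stride
    center_x = offset + col * stride

    # Build a square box of the receptive field size, clipped to the image.
    half = rf_size / 2.0
    img_h, img_w = image_hw
    top = int(max(0, np.floor(center_y - half)))
    left = int(max(0, np.floor(center_x - half)))
    bottom = int(min(img_h, np.ceil(center_y + half)))
    right = int(min(img_w, np.ceil(center_x + half)))
    return top, left, bottom, right


def multi_level_regions(feature_maps, layer_specs, image_hw):
    """Apply the selection to several layers; each spec is a
    hypothetical (rf_size, stride, offset) tuple for one layer."""
    return [
        max_activation_region(fm, rf, st, off, image_hw)
        for fm, (rf, st, off) in zip(feature_maps, layer_specs)
    ]
```

Calling `multi_level_regions` with three feature maps and three layer specs would give one box per scale; each crop could then be passed to its own network for the second step. How the per-scale predictions are integrated is not specified in the abstract, so that part is left out here.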

Bibliographic details

  • Source
    Neurocomputing | 2021, Issue 17 | pp. 403-412 | 10 pages
  • Author affiliations

    Shanghai Jiao Tong Univ Shanghai Key Lab Intelligent Sensing & Recognit Shanghai 200240 Peoples R China;

    Tongji Univ Coll Surveying & Geoinformat Shanghai 200092 Peoples R China;

    Jiangxi Sci & Technol Normal Univ Sch Commun & Elect Nanchang 330013 Jiangxi Peoples R China;

    Tsinghua Univ Dept Elect Engn Beijing 100084 Peoples R China;

    Shandong Univ Sch Software Jinan 250101 Peoples R China;

  • Indexing: Science Citation Index (SCI), USA; Engineering Index (EI), USA
  • Original format: PDF
  • Language: English
  • Chinese Library Classification:
  • Keywords

    Fine-grained; Visual attention; Multi-level; Pre-trained; Dictionary learning;

