Conference: Asian Conference on Computer Vision

In-sample Contrastive Learning and Consistent Attention for Weakly Supervised Object Localization

Abstract

Weakly supervised object localization (WSOL) aims to localize the target object using only image-level supervision. Recent methods encourage the model to activate feature maps over the entire object by dropping the most discriminative parts. However, they are likely to extend the activation excessively into the background, which leads to overestimated localization. In this paper, we consider the background as an important cue that guides the feature activation to cover the sophisticated object region, and we propose a contrastive attention loss. The loss promotes similarity between the foreground and its dropped version, and dissimilarity between the dropped version and the background. Furthermore, we propose a foreground consistency loss that penalizes earlier layers for producing noisy attention, using the later layer as a reference to provide them with a sense of backgroundness. It guides the early layers to activate on objects rather than on locally distinctive backgrounds, so that their attention becomes similar to that of the later layer. To better optimize these losses, we replace channel-pooled attention with non-local attention blocks, which yields enhanced attention maps that account for spatial similarity. Finally, we propose to drop background regions in addition to the most discriminative region. Our method achieves state-of-the-art performance on the CUB-200-2011 and ImageNet benchmark datasets in terms of top-1 localization accuracy and MaxBoxAccV2, and we provide a detailed analysis of the individual components. The code will be publicly available online for reproducibility.
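The abstract does not give the exact form of the contrastive attention loss, so the following is only a minimal sketch of the idea it describes: pull the dropped foreground toward the foreground and push it away from the background. The cosine-distance measure, the triplet-style margin, and all names (`cosine_similarity`, `contrastive_attention_loss`, `margin`) are assumptions made for illustration, not the authors' formulation.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # treat a zero vector as having no similarity
    return dot / (norm_u * norm_v)


def contrastive_attention_loss(foreground, dropped_foreground, background, margin=0.5):
    """Hypothetical triplet-style loss (an assumption, not the paper's formula).

    Anchor: dropped foreground features.
    Positive: foreground features (should be similar to the anchor).
    Negative: background features (should be dissimilar to the anchor).
    """
    pos_distance = 1.0 - cosine_similarity(dropped_foreground, foreground)
    neg_distance = 1.0 - cosine_similarity(dropped_foreground, background)
    # Zero loss once the negative is farther than the positive by at least `margin`.
    return max(0.0, pos_distance - neg_distance + margin)


# Dropped foreground matches the foreground and is orthogonal to the background.
good = contrastive_attention_loss([1.0, 0.0], [1.0, 0.0], [0.0, 1.0])
# Dropped foreground has leaked into the background direction.
bad = contrastive_attention_loss([1.0, 0.0], [0.0, 1.0], [0.0, 1.0])
```

Under these assumptions the loss is zero when the dropped foreground already resembles the foreground more than the background by the margin, and it grows when the dropped features drift toward the background, which is the over-extension the abstract sets out to penalize.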
