Conference: IEEE/CVF Conference on Computer Vision and Pattern Recognition

Inflated Episodic Memory With Region Self-Attention for Long-Tailed Visual Recognition

Abstract

There has been increasing interest in modeling long-tailed data. Unlike artificially collected datasets, long-tailed data exist naturally in the real world and are thus more realistic. To deal with the class imbalance problem, we introduce an Inflated Episodic Memory (IEM) for long-tailed visual recognition. First, IEM augments convolutional neural networks with categorical representative features for rapid learning on tail classes. In traditional few-shot learning, a single prototype is usually used to represent a category. However, long-tailed data have higher intra-class variance, which makes it challenging to learn a single prototype per category. Thus, we introduce IEM to store the most discriminative feature for each category individually. In addition, the memory banks are updated independently, which further decreases the chance of learning skewed classifiers. Second, we introduce a novel region self-attention mechanism for multi-scale spatial feature map encoding. Incorporating more discriminative features is beneficial for improving generalization on tail classes. We propose to encode local feature maps at multiple scales while aggregating spatial contextual information at the same time. Equipped with IEM and region self-attention, we achieve state-of-the-art performance on four standard long-tailed image recognition benchmarks. We also validate the effectiveness of IEM on a long-tailed video recognition benchmark, YouTube-8M.
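To make the per-category memory idea more concrete, here is a minimal sketch in plain Python. It is not the authors' implementation: the `ClassMemory` class, its `update` and `read` methods, and the rule of keeping the feature with the highest classifier confidence as a stand-in for "most discriminative" are all assumptions made for illustration. The sketch only shows two properties the abstract names: each category has its own slot, and writing to one slot never touches another.

```python
import math


class ClassMemory:
    """Toy per-class memory: one slot per class, each updated independently."""

    def __init__(self, num_classes):
        # Each slot holds (feature, confidence); None means the slot is empty.
        self.slots = [None] * num_classes

    def update(self, label, feature, confidence):
        """Overwrite the slot of `label` only if this sample is more confident.

        Assumption: "most discriminative" is approximated by the classifier's
        confidence on the true class. Slots of other classes are never touched.
        """
        slot = self.slots[label]
        if slot is None or confidence > slot[1]:
            self.slots[label] = (list(feature), confidence)

    def read(self, query):
        """Return softmax attention weights of `query` over the filled slots."""
        filled = [(c, s[0]) for c, s in enumerate(self.slots) if s is not None]
        if not filled:
            return {}
        scores = [sum(q * k for q, k in zip(query, key)) for _, key in filled]
        top = max(scores)  # subtract the max for numerical stability
        exps = [math.exp(s - top) for s in scores]
        total = sum(exps)
        return {c: e / total for (c, _), e in zip(filled, exps)}


memory = ClassMemory(num_classes=3)
memory.update(label=0, feature=[1.0, 0.0], confidence=0.9)  # head class
memory.update(label=2, feature=[0.0, 1.0], confidence=0.4)  # tail class
memory.update(label=2, feature=[0.0, 2.0], confidence=0.7)  # replaces the weaker tail entry
memory.update(label=0, feature=[5.0, 5.0], confidence=0.2)  # ignored: less confident

weights = memory.read(query=[0.0, 1.0])
```

In this toy run the query aligns with the stored tail-class feature, so the tail class receives the larger attention weight even though it was seen with lower confidence than the head class. A real system would store learned CNN features and combine the memory readout with the network's own prediction; how that is done in IEM is not specified in the abstract.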