Conference: International Conference on Digital Image Computing: Techniques and Applications

Image Descriptors from ConvNets: Comparing Global Pooling Methods for Image Retrieval


Abstract

A major component of a generic image retrieval pipeline is producing concise and effective descriptors for each image. Previous works have shown impressive results in image retrieval when using descriptors taken from the black-box output of the fully-connected stage of pretrained Convolutional Neural Networks (ConvNets). However, previous work has also shown that descriptors pooled from the deep feature maps of late convolutional layers can be more discriminative for generic image retrieval while remaining relatively concise. When planning to globally pool such feature maps from a ConvNet, some options to consider are (1) the depth of the network, (2) the choice of layer to pool, and (3) the level of dimension reduction. Previous work on global pooling methods uses differing techniques, with no clear consensus on which method is best. This motivates us to establish a baseline pipeline against which to compare these options and their effect on retrieval results. Our contribution is a systematic and comprehensive experimental study of different strategies for pooling deep features for image retrieval, and of the options listed above. Our results show that the nature of the dataset (object-heavy or scene-heavy) warrants a different pooling strategy. Significantly, we visualise the level of image discrimination brought by the different pooling methods on the datasets, and show that pooling need not have a priori spatial weights to effectively find objects within the image. The results underline the need to consider the context of the image dataset when developing image retrieval pipelines using ConvNets.
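As a rough illustration of what globally pooling a convolutional feature map into an image descriptor can look like, here is a minimal sketch. It is not the paper's pipeline: the function names (`global_pool`, `l2_normalize`, `rank`), the three pooling variants, and the cosine-similarity ranking are all assumptions made for this example, and plain Python nested lists stand in for real ConvNet activations of shape C x H x W. Dimension reduction is left out.

```python
import math


def global_pool(feature_map, method="max"):
    """Pool a C x H x W feature map (nested lists) into a C-dimensional descriptor.

    Hypothetical helper for illustration; each channel collapses to one number.
    """
    descriptor = []
    for channel in feature_map:
        values = [v for row in channel for v in row]  # flatten the H x W grid
        if method == "max":
            descriptor.append(max(values))
        elif method == "sum":
            descriptor.append(sum(values))
        elif method == "avg":
            descriptor.append(sum(values) / len(values))
        else:
            raise ValueError(f"unknown pooling method: {method}")
    return descriptor


def l2_normalize(vec):
    """Scale a vector to unit Euclidean length (all-zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm > 0 else list(vec)


def cosine_similarity(a, b):
    """Cosine similarity, computed as the dot product of the L2-normalized vectors."""
    return sum(x * y for x, y in zip(l2_normalize(a), l2_normalize(b)))


def rank(query_map, database_maps, method="max"):
    """Return database indices ordered from most to least similar to the query."""
    query = global_pool(query_map, method)
    scores = [
        (cosine_similarity(query, global_pool(fmap, method)), index)
        for index, fmap in enumerate(database_maps)
    ]
    return [index for _, index in sorted(scores, key=lambda s: -s[0])]
```

With real networks the feature map would come from a late convolutional layer of a pretrained model, and the choice between `"max"`, `"sum"`, and `"avg"` is the kind of option the study compares; this sketch only shows the mechanics on toy data.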
