Conference: IEEE/CVF Conference on Computer Vision and Pattern Recognition

Solving Mixed-Modal Jigsaw Puzzle for Fine-Grained Sketch-Based Image Retrieval

Abstract

ImageNet pre-training has long been considered crucial by the fine-grained sketch-based image retrieval (FG-SBIR) community due to the lack of large sketch-photo paired datasets for FG-SBIR training. In this paper, we propose a self-supervised alternative for representation pre-training. Specifically, we consider the jigsaw puzzle game of recomposing images from shuffled parts. We identify two key facets of jigsaw task design that are required for effective FG-SBIR pre-training. The first is formulating the puzzle in a mixed-modality fashion. Second, we show that framing the optimisation as permutation matrix inference via Sinkhorn iterations is more effective than the common classifier formulation of jigsaw self-supervision. Experiments show that this self-supervised pre-training strategy significantly outperforms the standard ImageNet-based pipeline across all four product-level FG-SBIR benchmarks. Interestingly, it also leads to improved cross-category generalisation across both pre-train/fine-tune and fine-tune/testing stages.
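The two ingredients named in the abstract, a mixed-modal puzzle and permutation inference via Sinkhorn iterations, can be illustrated with a small sketch. This is not the authors' implementation: the function names (`make_mixed_modal_puzzle`, `permutation_matrix`, `sinkhorn`), the 3x3 grid, the 50% chance of taking a patch from either modality, and the temperature `tau` are all assumptions made for illustration, and the patches are placeholder strings rather than image crops. In a real pipeline the score matrix would presumably be produced by a network from the shuffled patches; here it is random.

```python
import math
import random


def make_mixed_modal_puzzle(photo_patches, sketch_patches, rng):
    """Build one puzzle whose patches are drawn from both modalities.

    For every grid position the patch is taken from the photo or from the
    paired sketch (assumed 50/50), then the positions are shuffled.
    Returns (shuffled, perm), where shuffled[k] came from position perm[k].
    """
    n = len(photo_patches)
    mixed = [photo_patches[i] if rng.random() < 0.5 else sketch_patches[i]
             for i in range(n)]
    perm = list(range(n))
    rng.shuffle(perm)
    return [mixed[p] for p in perm], perm


def permutation_matrix(perm):
    """Target matrix: row k has a single 1 in column perm[k]."""
    n = len(perm)
    return [[1.0 if perm[k] == j else 0.0 for j in range(n)] for k in range(n)]


def _logsumexp(values):
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def sinkhorn(scores, n_iters=20, tau=1.0):
    """Relax a square score matrix towards a doubly stochastic matrix.

    Alternates row and column normalisation in log space. The result is a
    soft stand-in for a permutation matrix, so it can be compared against
    the target from permutation_matrix instead of classifying the
    permutation as one label among many.
    """
    n = len(scores)
    log_p = [[s / tau for s in row] for row in scores]
    for _ in range(n_iters):
        for i in range(n):  # make every row sum to 1
            z = _logsumexp(log_p[i])
            log_p[i] = [v - z for v in log_p[i]]
        for j in range(n):  # make every column sum to 1
            z = _logsumexp([log_p[i][j] for i in range(n)])
            for i in range(n):
                log_p[i][j] -= z
    return [[math.exp(v) for v in row] for row in log_p]


# Usage with placeholder patches on an assumed 3x3 grid.
rng = random.Random(0)
photo = [f"photo_{i}" for i in range(9)]
sketch = [f"sketch_{i}" for i in range(9)]
shuffled, perm = make_mixed_modal_puzzle(photo, sketch, rng)
target = permutation_matrix(perm)

# Random scores stand in for a network's output (assumption).
scores = [[rng.gauss(0.0, 1.0) for _ in range(9)] for _ in range(9)]
soft_perm = sinkhorn(scores)
```

A lower `tau` pushes the output closer to a hard permutation, at the cost of sharper gradients. The appeal of this formulation over a classifier is that the output size grows with the square of the number of patches rather than with the number of permutations.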
