International Conference on Computer Vision

ACMM: Aligned Cross-Modal Memory for Few-Shot Image and Sentence Matching

Abstract

Image and sentence matching has drawn much attention recently, but due to the lack of sufficient pairwise training data, most previous methods still cannot reliably associate challenging pairs of images and sentences that contain rarely appearing regions and words, i.e., few-shot content. In this work, we study this challenging scenario as few-shot image and sentence matching, and accordingly propose an Aligned Cross-Modal Memory (ACMM) model to memorize the rarely appearing content. Given a pair of image and sentence, the model first employs an aligned memory controller network to produce two sets of semantically comparable interface vectors through cross-modal alignment. The interface vectors are then used by modality-specific read and update operations to alternately interact with shared memory items. The memory items persistently memorize cross-modal shared semantic representations, which can be addressed to enhance the representation of few-shot content. We apply the proposed model to both conventional and few-shot image and sentence matching tasks, and demonstrate its effectiveness by achieving state-of-the-art performance on two benchmark datasets.
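
The abstract outlines three interacting parts: an aligned controller that maps both modalities into comparable interface vectors, modality-specific read and update operations that address a shared memory, and memory items that persist across pairs. Below is a minimal PyTorch sketch of how such a module could be wired, assuming dot-product attention for both the cross-modal alignment and the memory addressing, and a sigmoid-gated write. All names, shapes, and the specific gating form are assumptions made for illustration; the abstract does not give the exact equations.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ACMMSketch(nn.Module):
    """Illustrative sketch of an aligned cross-modal memory.

    All shapes, names, and update rules here are assumptions; the
    abstract does not specify the exact formulation.
    """

    def __init__(self, dim: int = 512, num_items: int = 64):
        super().__init__()
        # Shared memory items that persist across image-sentence pairs.
        self.memory = nn.Parameter(0.02 * torch.randn(num_items, dim))
        # Aligned memory controller: projects each modality into a common
        # space so the two sets of interface vectors are comparable.
        self.img_ctrl = nn.Linear(dim, dim)
        self.txt_ctrl = nn.Linear(dim, dim)
        # Gates for the modality-specific memory updates (assumed form).
        self.img_gate = nn.Linear(2 * dim, dim)
        self.txt_gate = nn.Linear(2 * dim, dim)

    def _read(self, iface: torch.Tensor) -> torch.Tensor:
        # Address memory items by similarity, read a convex combination.
        attn = F.softmax(iface @ self.memory.t(), dim=-1)   # (n, K)
        return attn @ self.memory                           # (n, dim)

    def _update(self, iface: torch.Tensor, gate: nn.Linear) -> torch.Tensor:
        # Gated write: blend each memory item with the content that the
        # interface vectors route to it.
        attn = F.softmax(iface @ self.memory.t(), dim=-1)   # (n, K)
        write = attn.t() @ iface                            # (K, dim)
        g = torch.sigmoid(gate(torch.cat([self.memory, write], dim=-1)))
        return g * write + (1.0 - g) * self.memory

    def forward(self, regions: torch.Tensor, words: torch.Tensor):
        # Cross-modal alignment: attend words over regions and vice versa,
        # yielding semantically comparable interface vectors per modality.
        sim = regions @ words.t()                           # (R, W)
        img_iface = self.img_ctrl(regions + F.softmax(sim, dim=-1) @ words)
        txt_iface = self.txt_ctrl(words + F.softmax(sim.t(), dim=-1) @ regions)
        # Read: enhance each representation with memorized shared semantics.
        regions_enh = regions + self._read(img_iface)
        words_enh = words + self._read(txt_iface)
        # Alternate modality-specific updates of the shared memory
        # (simplified here to a detached, in-place persistence step).
        with torch.no_grad():
            self.memory.data = self._update(img_iface, self.img_gate)
            self.memory.data = self._update(txt_iface, self.txt_gate)
        return regions_enh, words_enh
```

Under these assumptions, calling `ACMMSketch()(torch.randn(36, 512), torch.randn(12, 512))` on a pair of 36 region features and 12 word features returns memory-enhanced region and word representations, which would then feed whatever matching objective the full model is trained with.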