International Conference on Multimedia Modeling

Multi-hop Interactive Cross-Modal Retrieval


Abstract

Conventional representation-learning-based cross-modal retrieval approaches typically represent a sentence with a single global embedding, which neglects the local correlations between objects in the image and phrases in the sentence. In this paper, we present a novel Multi-hop Interactive Cross-modal Retrieval Model (MICRM), which interactively exploits the local correlations between images and words. We design a multi-hop interactive module to infer high-order relevance between an image and a sentence. Experimental results on two benchmark datasets, MS-COCO and Flickr30K, demonstrate that our multi-hop interactive model performs significantly better than several competitive cross-modal retrieval methods.
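The interaction mechanism is only summarized in the abstract, so the PyTorch sketch below illustrates one plausible reading of a multi-hop interactive module: the words form a query vector that repeatedly attends over image region features, and the refined query is scored for image-sentence relevance. The class name MultiHopInteraction, the GRU-cell query update, the feature dimension, and the hop count are all illustrative assumptions, not the authors' implementation.

```python
# A minimal sketch (not the paper's released code) of multi-hop interaction:
# word features build a query that attends over image regions for several
# hops, folding the attended visual context back into the query each time.
import torch
import torch.nn as nn

class MultiHopInteraction(nn.Module):
    def __init__(self, dim=512, num_hops=3):   # dim and num_hops are assumptions
        super().__init__()
        self.num_hops = num_hops
        self.query_update = nn.GRUCell(dim, dim)  # refines the query each hop
        self.score = nn.Linear(dim, 1)            # final relevance head

    def forward(self, regions, words):
        # regions: (B, R, dim) image region features, e.g. from a detector
        # words:   (B, W, dim) word features, e.g. from a GRU sentence encoder
        query = words.mean(dim=1)                  # initial sentence-level query, (B, dim)
        for _ in range(self.num_hops):
            # attend over image regions conditioned on the current query
            attn = torch.softmax(
                torch.bmm(regions, query.unsqueeze(2)).squeeze(2), dim=1)   # (B, R)
            context = torch.bmm(attn.unsqueeze(1), regions).squeeze(1)      # (B, dim)
            # fold the attended visual context back into the query
            query = self.query_update(context, query)
        return self.score(query).squeeze(1)        # (B,) image-sentence relevance

# usage: score 4 image-sentence pairs with 36 regions and 12 words each
model = MultiHopInteraction(dim=512, num_hops=3)
rel = model(torch.randn(4, 36, 512), torch.randn(4, 12, 512))
print(rel.shape)  # torch.Size([4])
```

In a full retrieval system, such a relevance score would presumably be trained with a ranking objective (e.g. a triplet loss) over matched and mismatched image-sentence pairs; that training detail is likewise an assumption here.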

