IEEE/CVF Conference on Computer Vision and Pattern Recognition

Look, Imagine and Match: Improving Textual-Visual Cross-Modal Retrieval with Generative Models

Abstract

Textual-visual cross-modal retrieval has been a hot research topic in both the computer vision and natural language processing communities. Learning appropriate representations for multi-modal data is crucial for cross-modal retrieval performance. Unlike existing image-text retrieval approaches that embed image-text pairs as single feature vectors in a common representational space, we propose to incorporate generative processes into the cross-modal feature embedding, through which we are able to learn not only global abstract features but also local grounded features. Extensive experiments show that our framework can accurately match images and sentences with complex content, and achieves state-of-the-art cross-modal retrieval results on the MSCOCO dataset.
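The abstract does not give implementation details, but the core idea can be sketched: encode images and sentences into a shared embedding space trained with a matching objective, and add a generative image-to-caption branch whose reconstruction loss grounds the learned features. The following is a minimal PyTorch sketch under stated assumptions; the module choices, dimensions, and the hinge-based triplet ranking loss are illustrative conventions from the image-text matching literature, not the authors' architecture.

```python
# A minimal sketch (not the authors' code) of cross-modal embedding with a
# generative branch. All names, sizes, and losses are illustrative assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

class CrossModalEmbedding(nn.Module):
    def __init__(self, img_dim=2048, txt_vocab=10000, embed_dim=512):
        super().__init__()
        self.img_fc = nn.Linear(img_dim, embed_dim)      # global image embedding
        self.txt_embed = nn.Embedding(txt_vocab, embed_dim)
        self.txt_rnn = nn.GRU(embed_dim, embed_dim, batch_first=True)
        # generative branch: decode a caption conditioned on the image embedding
        self.decoder = nn.GRU(embed_dim, embed_dim, batch_first=True)
        self.out = nn.Linear(embed_dim, txt_vocab)

    def encode_image(self, img_feat):
        return F.normalize(self.img_fc(img_feat), dim=-1)

    def encode_text(self, tokens):
        _, h = self.txt_rnn(self.txt_embed(tokens))      # h: (1, B, embed_dim)
        return F.normalize(h.squeeze(0), dim=-1)

    def caption_logits(self, img_feat, tokens):
        # teacher-forced decoding from the image embedding as initial state
        h0 = self.img_fc(img_feat).unsqueeze(0)
        out, _ = self.decoder(self.txt_embed(tokens), h0)
        return self.out(out)

def matching_loss(img_emb, txt_emb, margin=0.2):
    # hinge-based triplet ranking loss over in-batch negatives; assumes
    # image i and sentence i in the batch are the positive pair
    scores = img_emb @ txt_emb.t()
    pos = scores.diag().unsqueeze(1)
    cost_s = (margin + scores - pos).clamp(min=0)        # sentence negatives
    cost_im = (margin + scores - pos.t()).clamp(min=0)   # image negatives
    mask = torch.eye(scores.size(0), dtype=torch.bool)
    return cost_s.masked_fill(mask, 0).sum() + cost_im.masked_fill(mask, 0).sum()

# toy joint training step: matching loss + caption reconstruction loss
model = CrossModalEmbedding()
img = torch.randn(4, 2048)                  # e.g. pooled CNN features
txt = torch.randint(1, 10000, (4, 12))      # token ids
i_emb, t_emb = model.encode_image(img), model.encode_text(txt)
logits = model.caption_logits(img, txt[:, :-1])
gen_loss = F.cross_entropy(logits.reshape(-1, 10000), txt[:, 1:].reshape(-1))
loss = matching_loss(i_emb, t_emb) + gen_loss
loss.backward()
```

In such a setup, the matching loss shapes the global abstract features of the shared space, while the generative reconstruction loss pushes the image representation to retain the locally grounded detail needed to regenerate the sentence.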
