Pattern Recognition: The Journal of the Pattern Recognition Society

Re-ranking image-text matching by adaptive metric fusion


Abstract

Image-text matching has drawn much attention recently with the rapid growth of multi-modal data. Many effective approaches have been proposed to solve this challenging problem, but limited effort has been devoted to re-ranking methods. Compared with uni-modal re-ranking, modality heterogeneity is the major difficulty when designing a re-ranking method in the cross-modal field, and it mainly lies in two aspects: the different visual and textual feature spaces, and the different score distributions in the two inverse retrieval directions. In this paper, we propose a heuristic re-ranking method called Adaptive Metric Fusion (AMF) for image-text matching. The method obtains a better metric by adaptively fusing metrics based on two modules: 1) Cross-modal Reciprocal Encoding, which considers ranks in both retrieval directions to comprehensively evaluate a metric; sentence retrieval and image retrieval have different distribution characteristics and galleries in different modalities, so it is necessary to exploit them simultaneously for appropriate metric fusion. 2) Query Replacement Gap, which quantifies the gap between cross-modal and uni-modal similarities to alleviate the influence of the different visual and textual feature spaces on the fused metric. The proposed re-ranking method can be implemented in an unsupervised way without requiring any human interaction or annotated data, and can be easily applied to any initial ranking result. Extensive experiments and analysis validate the effectiveness of our method on the large-scale MS-COCO and Flickr30K datasets. (C) 2020 Elsevier Ltd. All rights reserved.
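The abstract does not give AMF's exact formulation, but the core re-ranking idea it describes — fusing an initial cross-modal metric with rank information from both retrieval directions — can be sketched as follows. The function name, the reciprocal-rank scoring, and the fixed fusion weight `alpha` are illustrative assumptions for this sketch, not the paper's adaptive weighting scheme.

```python
def rerank_by_metric_fusion(sim, alpha=0.5):
    """Re-rank an image-text similarity matrix by fusing the initial
    metric with reciprocal-rank scores from both retrieval directions.

    sim:   n_images x n_texts list of lists of initial similarities.
    alpha: fixed fusion weight (an assumption; the paper's AMF fuses
           metrics adaptively rather than with a constant weight).
    Returns a fused similarity matrix of the same shape.
    """
    n_img, n_txt = len(sim), len(sim[0])

    def ranks(scores):
        # 0-based rank of each entry when sorted descending
        # (rank 0 = most similar).
        order = sorted(range(len(scores)), key=lambda k: -scores[k])
        r = [0] * len(scores)
        for rank, idx in enumerate(order):
            r[idx] = rank
        return r

    # Image -> text direction: rank all texts for each image query.
    rank_i2t = [ranks(row) for row in sim]
    # Text -> image direction: rank all images for each text query.
    rank_t2i = [ranks([sim[i][j] for i in range(n_img)])
                for j in range(n_txt)]

    fused = [[0.0] * n_txt for _ in range(n_img)]
    for i in range(n_img):
        for j in range(n_txt):
            # Reciprocal-rank scores in (0, 1] from both directions,
            # so a pair must rank well both ways to score highly.
            s_i2t = 1.0 / (1.0 + rank_i2t[i][j])
            s_t2i = 1.0 / (1.0 + rank_t2i[j][i])
            fused[i][j] = (alpha * sim[i][j]
                           + (1 - alpha) * 0.5 * (s_i2t + s_t2i))
    return fused
```

Because both directions enter the fused score, a pair that ranks highly for the image query but poorly for the text query is demoted, which is the motivation the abstract gives for exploiting the two retrieval directions simultaneously.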
