Beyond the Deep Metric Learning: Enhance the Cross-Modal Matching with Adversarial Discriminative Domain Regularization

Abstract

Matching information across image and text modalities is a fundamental challenge for many applications that involve both vision and natural language processing. The objective is to find efficient similarity metrics for comparing visual and textual information. Existing approaches mainly match local visual objects and sentence words in a shared space using attention mechanisms. Matching performance remains limited because the similarity computation relies on simple comparisons of the matching features and ignores how those features are distributed in the data. In this paper, we address this limitation with an efficient learning objective that considers the discriminative feature distributions between the visual objects and sentence words. Specifically, we propose a novel Adversarial Discriminative Domain Regularization (ADDR) learning framework, which goes beyond the standard metric learning objective, to construct a set of discriminative data domains within each image-text pair. Our approach can generally improve the learning efficiency and the performance of existing metric learning frameworks by regulating the distribution of the hidden space between the matching pairs. The experimental results show that this new approach significantly improves the overall performance of several popular cross-modal matching techniques (SCAN [13], VSRN [14], BFAN [15]) on the MS-COCO and Flickr30K benchmarks.
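
The abstract does not spell out the base metric learning objective that ADDR extends. As a rough illustration only, the sketch below assumes a hinge-based triplet ranking loss with hardest in-batch negatives over a cosine similarity matrix, a common choice in image-text matching. The function names, the margin value, and the toy vectors are hypothetical, and the ADDR regularizer itself is not shown.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def similarity_matrix(images, texts):
    """sim[i][j] = similarity of image i and caption j (assumed shared space)."""
    return [[cosine(img, txt) for txt in texts] for img in images]


def triplet_ranking_loss(sim, margin=0.2):
    """Hinge-based ranking loss; matching pairs lie on the diagonal of sim.

    Assumption: each image k is paired with caption k, and the batch
    holds at least two pairs so that every pair has a negative.
    """
    n = len(sim)
    total = 0.0
    for k in range(n):
        pos = sim[k][k]
        # Hardest non-matching caption for image k.
        hardest_text = max(sim[k][j] for j in range(n) if j != k)
        # Hardest non-matching image for caption k.
        hardest_image = max(sim[j][k] for j in range(n) if j != k)
        total += max(0.0, margin + hardest_text - pos)
        total += max(0.0, margin + hardest_image - pos)
    return total / n


# Toy features: two images and their two captions in a 2-D shared space.
images = [[1.0, 0.0], [0.0, 1.0]]
aligned_texts = [[1.0, 0.0], [0.0, 1.0]]
swapped_texts = [[0.0, 1.0], [1.0, 0.0]]

loss_aligned = triplet_ranking_loss(similarity_matrix(images, aligned_texts))
loss_swapped = triplet_ranking_loss(similarity_matrix(images, swapped_texts))
```

With these toy vectors the aligned pairs incur no loss, while the swapped captions violate the margin in both directions. A loss of this form compares similarity scores only, which is the kind of simple feature comparison the abstract describes as limiting.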


Bibliographic record

Source
International Conference on Pattern Recognition, 2021, pp. 10165-10172 (8 pages)
Authors
Li Ren; Kai Li; LiQiang Wang; Kien Hua
Original format
PDF
Keywords
Measurement; Location awareness; Visualization; Semantics; Benchmark testing; Information retrieval; Natural language processing

Similar literature

1. Deep adversarial metric learning for cross-modal retrieval [J]. Xu Xing, He Li, Lu Huimin. World Wide Web. 2019, No. 2
2. Deep Coupled Metric Learning for Cross-Modal Matching [J]. Venice Erin Liong, Jiwen Lu, Yap-Peng Tan. IEEE Transactions on Multimedia. 2017, No. 6
3. Adversarial Discriminative Active Deep Learning for Domain Adaptation in Hyperspectral Images Classification [J]. Saboori Arash, Ghassemian Hassan. International Journal of Remote Sensing. 2021, No. 9a10
4. Cross-modal deep metric learning with multi-task regularization [C]. Xin Huang, Yuxin Peng. IEEE International Conference on Multimedia and Expo. 2017
5. Deep Adversarial Learning Based Domain Adaptation for Multi-Modal Image Analysis [D]. Makkar, Nikhil. 2018
6. Audio-Based Drone Detection and Identification Using Deep Learning Techniques with Dataset Enhancement through Generative Adversarial Networks [O]. Sara Al-Emadi, Abdulla Al-Ali, Abdulaziz Al-Ali. 2021
7. Beyond the Deep Metric Learning: Enhance the Cross-Modal Matching with Adversarial Discriminative Domain Regularization [O]. Li Ren, Kai Li, LiQiang Wang. 2021
