Journal: Quantum electronics

Ternary Adversarial Networks With Self-Supervision for Zero-Shot Cross-Modal Retrieval



Abstract

Given a query instance from one modality (e.g., an image), cross-modal retrieval aims to find semantically similar instances in another modality (e.g., text). To perform cross-modal retrieval, existing approaches typically learn a common semantic space from a labeled source set and directly produce representations in that space for the instances of a target set. These methods commonly require that the instances of both sets share the same classes. Consequently, they may not generalize well to the more practical scenario of zero-shot cross-modal retrieval, in which the target set contains unseen classes whose semantics are inconsistent with the seen classes of the source set. Inspired by zero-shot learning, we propose a novel model called ternary adversarial networks with self-supervision (TANSS) to overcome the limitations of existing methods on this challenging task. Our TANSS approach consists of three parallel subnetworks: 1) two semantic feature learning subnetworks that capture the intrinsic data structures of the different modalities and preserve cross-modal relationships via semantic features in the common semantic space; 2) a self-supervised semantic subnetwork that leverages the word vectors of both seen and unseen labels to supervise the semantic feature learning and enhance knowledge transfer to unseen labels; and 3) an adversarial learning scheme that maximizes the consistency and correlation of the semantic features across modalities. The three subnetworks are integrated into an end-to-end network architecture that enables efficient iterative parameter optimization. Comprehensive experiments on three cross-modal datasets show the effectiveness of our TANSS approach compared with state-of-the-art methods for zero-shot cross-modal retrieval.
