Journal: IEEE Transactions on Knowledge and Data Engineering

Semi-Supervised Multi-Modal Multi-Instance Multi-Label Deep Network with Optimal Transport


Abstract

Complex objects usually carry multiple labels and can be described by several modal representations; for example, a complex article contains text and image information along with multiple annotations. Previous methods assume that homogeneous multi-modal data are consistent, whereas in real applications the raw data are disordered: an article may consist of a variable number of inconsistent text and image instances. Multi-modal Multi-instance Multi-label (M3) learning provides a framework for handling such tasks and has exhibited excellent performance. However, M3 learning faces two main challenges: 1) how to effectively utilize label correlation, and 2) how to take advantage of multi-modal learning to process unlabeled instances. To solve these problems, we first propose a novel Multi-modal Multi-instance Multi-label Deep Network (M3DN), which performs M3 learning in an end-to-end multi-modal deep network and applies a consistency principle among the bag-level predictions of different modalities. Based on M3DN, we learn the latent ground label metric with optimal transport. Moreover, we introduce extrinsic unlabeled multi-modal multi-instance data and propose M3DNS, which uses an instance-level auto-encoder for each single modality and a modified bag-level optimal transport to strengthen the consistency among modalities. M3DNS can thereby better predict labels and exploit label correlation simultaneously. Experiments on benchmark datasets and the real-world WKG Game-Hub dataset validate the effectiveness of the proposed methods.
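The abstract does not give the optimal transport formulation, so the following is only a minimal sketch of the general idea: comparing a predicted label distribution with a target one under a ground cost between labels, using generic entropy-regularized (Sinkhorn) optimal transport. The function name `sinkhorn_distance`, the regularization value, and the hand-picked 3-label cost matrix are all illustrative assumptions. In particular, the paper learns the ground label metric, whereas this sketch fixes it by hand.

```python
import numpy as np

def sinkhorn_distance(pred, target, cost, reg=0.1, n_iters=200):
    """Entropy-regularized OT cost between two label score vectors.

    Hypothetical helper for illustration; not the paper's implementation.
    """
    # Turn non-negative label scores into probability vectors.
    a = np.clip(np.asarray(pred, dtype=float), 1e-12, None)
    b = np.clip(np.asarray(target, dtype=float), 1e-12, None)
    a, b = a / a.sum(), b / b.sum()

    kernel = np.exp(-cost / reg)  # Gibbs kernel of the ground cost
    u = np.ones_like(a)
    for _ in range(n_iters):
        # Alternate scaling so the plan matches both marginals.
        v = b / (kernel.T @ u)
        u = a / (kernel @ v)

    plan = u[:, None] * kernel * v[None, :]  # transport plan
    return float((plan * cost).sum()), plan

# Assumed ground cost: labels 0 and 1 are strongly correlated (cheap to
# confuse), label 2 is unrelated to both (expensive to confuse).
cost = np.array([[0.0, 0.1, 1.0],
                 [0.1, 0.0, 1.0],
                 [1.0, 1.0, 0.0]])

target = np.array([0.8, 0.1, 0.1])
pred_near = np.array([0.1, 0.8, 0.1])  # mass on the correlated label
pred_far = np.array([0.1, 0.1, 0.8])   # mass on the unrelated label

d_near, plan_near = sinkhorn_distance(pred_near, target, cost)
d_far, plan_far = sinkhorn_distance(pred_far, target, cost)
print(d_near, d_far)
```

Under this assumed cost matrix, a prediction that confuses two correlated labels is penalized less than one that puts its mass on an unrelated label. This is how a transport-based loss can make use of label correlation, which a per-label loss ignores.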
