ACM Transactions on Multimedia Computing, Communications, and Applications

CM-GANs: Cross-modal Generative Adversarial Networks for Common Representation Learning

Abstract

It is known that the inconsistent distributions and representations of different modalities, such as image and text, cause the heterogeneity gap, which makes it very challenging to correlate heterogeneous data and measure their similarities. Recently, generative adversarial networks (GANs) have been proposed and have shown a strong ability to model data distributions and learn discriminative representations. It has also been shown that adversarial learning can be fully exploited to learn discriminative common representations that bridge the heterogeneity gap. Inspired by this, we aim to effectively correlate large-scale heterogeneous data of different modalities with the power of GANs to model the cross-modal joint distribution. In this article, we propose Cross-modal Generative Adversarial Networks (CM-GANs) with the following contributions. First, a cross-modal GAN architecture is proposed to model the joint distribution over the data of different modalities. The inter-modality and intra-modality correlations are explored simultaneously by the generative and discriminative models, which compete with each other to promote cross-modal correlation learning. Second, cross-modal convolutional autoencoders with a weight-sharing constraint are proposed to form the generative model. They not only exploit the cross-modal correlation for learning the common representations but also preserve reconstruction information for capturing the semantic consistency within each modality. Third, a cross-modal adversarial training mechanism is proposed, which uses two kinds of discriminative models to simultaneously conduct intra-modality and inter-modality discrimination. These mutually boost each other through adversarial training to make the generated common representations more discriminative. In summary, the proposed CM-GANs approach uses GANs to perform cross-modal common representation learning, by which heterogeneous data can be effectively correlated. Extensive experiments verify the performance of CM-GANs on cross-modal retrieval against 13 state-of-the-art methods on 4 cross-modal datasets.
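The abstract gives no implementation details, so the following is only a minimal PyTorch sketch of the described components: two modality-specific autoencoders coupled by a weight-sharing constraint (here, a literally shared top encoder layer), plus the two kinds of discriminators (intra-modality and inter-modality). The fully connected layers, the feature dimensions (4096-d image features, 300-d text features), and all names are illustrative assumptions, not the paper's convolutional architecture.

```python
import torch
import torch.nn as nn

class ModalityAutoencoder(nn.Module):
    """One modality's autoencoder; the top encoder layer is shared across modalities."""
    def __init__(self, input_dim, hidden_dim, common_dim, shared_top):
        super().__init__()
        self.encode = nn.Sequential(nn.Linear(input_dim, hidden_dim), nn.ReLU())
        self.shared_top = shared_top  # weight-sharing constraint: same module in both branches
        self.decode = nn.Sequential(nn.Linear(common_dim, hidden_dim), nn.ReLU(),
                                    nn.Linear(hidden_dim, input_dim))

    def forward(self, x):
        c = self.shared_top(self.encode(x))  # common representation
        return c, self.decode(c)             # representation + reconstruction

hidden, common = 512, 256
shared_top = nn.Linear(hidden, common)  # couples the image and text branches
image_ae = ModalityAutoencoder(4096, hidden, common, shared_top)  # e.g., CNN image features
text_ae = ModalityAutoencoder(300, hidden, common, shared_top)    # e.g., averaged word vectors

# Two kinds of discriminators, as the abstract describes:
#   intra-modality: tells an original feature from its reconstruction within one modality
#   inter-modality: tells which modality a common representation came from
intra_disc_image = nn.Sequential(nn.Linear(4096, 256), nn.ReLU(), nn.Linear(256, 1))
intra_disc_text = nn.Sequential(nn.Linear(300, 256), nn.ReLU(), nn.Linear(256, 1))
inter_disc = nn.Sequential(nn.Linear(common, 128), nn.ReLU(), nn.Linear(128, 1))

# Generator-side losses (sketch): reconstruction within each modality, with adversarial
# terms against both discriminator types added during training.
img, txt = torch.randn(8, 4096), torch.randn(8, 300)
c_img, rec_img = image_ae(img)
c_txt, rec_txt = text_ae(txt)
recon_loss = nn.functional.mse_loss(rec_img, img) + nn.functional.mse_loss(rec_txt, txt)
```

Because `shared_top` is the same module object in both branches, gradients from image and text batches both update it, which is one simple way to realize a weight-sharing constraint on the common-representation layer.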