Journal: Neurocomputing

Dynamic interaction networks for image-text multimodal learning


Abstract

Recently, there has been a surge of interest in image-text multimodal representation learning, and many neural-network-based models have been proposed to capture the interaction between the two modalities using different forms of functions. Despite their success, a potential limitation of these methods is that a single set of static parameters is insufficient to model all kinds of interactions. To alleviate this problem, we present a dynamic interaction network, in which the parameters of the interaction function are dynamically generated by a meta network. Additionally, to provide the multimodal features that the meta network needs, we propose a new neural module called Multimodal Transformer. Experimentally, we not only conduct a comprehensive quantitative evaluation on four image-text tasks, but also present interpretable analyses of our models, revealing the internal working mechanism of the dynamic parameter learning. © 2019 Elsevier B.V. All rights reserved.
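
The core idea in the abstract, an interaction function whose parameters are generated per input by a meta network, can be illustrated with a small sketch. This is not the paper's architecture: the class name `DynamicInteraction`, the single-layer meta network, the `tanh` squashing, and the use of plain concatenation in place of the Multimodal Transformer are all assumptions made for illustration, and the weights are random rather than learned.

```python
import math
import random


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def random_matrix(rows, cols, rng):
    """Small random weights; a stand-in for parameters that would be learned."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


class DynamicInteraction:
    """Hypothetical sketch: a meta network emits interaction weights per input pair."""

    def __init__(self, img_dim, txt_dim, out_dim, seed=0):
        rng = random.Random(seed)
        self.out_dim = out_dim
        self.fused_dim = img_dim + txt_dim
        # Meta network (assumed: one linear layer). It maps the joint features
        # to a flattened (out_dim x fused_dim) weight matrix.
        self.meta_weights = random_matrix(
            out_dim * self.fused_dim, self.fused_dim, rng
        )

    def generate_parameters(self, image_feat, text_feat):
        """Return interaction weights that depend on this specific input pair."""
        # Assumption: concatenation stands in for the multimodal features
        # that the paper obtains from its Multimodal Transformer.
        fused = image_feat + text_feat
        flat = [math.tanh(v) for v in matvec(self.meta_weights, fused)]
        n = self.fused_dim
        return [flat[i * n:(i + 1) * n] for i in range(self.out_dim)]

    def forward(self, image_feat, text_feat):
        """Apply the dynamically generated weights to the same input pair."""
        weights = self.generate_parameters(image_feat, text_feat)
        return matvec(weights, image_feat + text_feat)


model = DynamicInteraction(img_dim=3, txt_dim=2, out_dim=2)
joint_a = model.forward([0.1, 0.4, -0.2], [0.3, 0.9])
joint_b = model.forward([0.8, -0.5, 0.0], [-0.1, 0.2])
```

In a static-parameter model the weight matrix applied in `forward` would be identical for every input. Here `generate_parameters` produces a different matrix for each image-text pair, which is the contrast the abstract draws.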
