International Conference on Neural Information Processing
Learning Joint Multimodal Representation Based on Multi-fusion Deep Neural Networks


Abstract

Recently, learning joint representations of multimodal data has received increasing attention. Multimodal features are concept-level composite features that are more effective than single-modality features. Most existing methods mine interactions between modalities only once, at the top of their networks, to learn a multimodal representation. In this paper, we propose a multi-fusion deep learning framework that learns semantically richer multimodal features. The framework sets multiple fusion points at different levels of the feature space, then integrates the fused information and passes it step by step from low levels to higher levels. Moreover, we propose a multi-channel decoding network with an alternating fine-tuning strategy to fully mine modality-specific information and cross-modality correlations. We are also the first to introduce deep learning features into multimodal deep learning, alleviating the semantic and statistical differences between modalities to learn better features. Extensive experiments on real-world datasets demonstrate that our proposed method achieves superior performance compared with state-of-the-art methods.
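
To make the fusion scheme described in the abstract concrete, the sketch below gives one plausible reading in PyTorch: two modality-specific encoders, a fusion point at a low feature level whose output is carried forward into a second, higher-level fusion, and two decoding channels that reconstruct each modality from the joint code. The class name MultiFusionAutoencoder, the layer sizes, and the reconstruction loss are illustrative assumptions rather than the authors' implementation; the alternating fine-tuning strategy mentioned in the abstract is not reproduced here.

```python
# Hypothetical sketch of multi-level fusion for two modalities (not the authors' code).
import torch
import torch.nn as nn

class MultiFusionAutoencoder(nn.Module):
    def __init__(self, dim_a=128, dim_b=64, hidden=256, joint=128):
        super().__init__()
        # Modality-specific encoder layers (two levels per modality in this sketch).
        self.enc_a1 = nn.Sequential(nn.Linear(dim_a, hidden), nn.ReLU())
        self.enc_b1 = nn.Sequential(nn.Linear(dim_b, hidden), nn.ReLU())
        self.enc_a2 = nn.Sequential(nn.Linear(hidden, hidden), nn.ReLU())
        self.enc_b2 = nn.Sequential(nn.Linear(hidden, hidden), nn.ReLU())
        # Fusion point 1: combine low-level features of both modalities.
        self.fuse1 = nn.Sequential(nn.Linear(2 * hidden, hidden), nn.ReLU())
        # Fusion point 2: combine higher-level features with the earlier fused code.
        self.fuse2 = nn.Sequential(nn.Linear(3 * hidden, joint), nn.ReLU())
        # Multi-channel decoders: reconstruct each modality from the joint representation.
        self.dec_a = nn.Linear(joint, dim_a)
        self.dec_b = nn.Linear(joint, dim_b)

    def forward(self, x_a, x_b):
        h_a1, h_b1 = self.enc_a1(x_a), self.enc_b1(x_b)
        f1 = self.fuse1(torch.cat([h_a1, h_b1], dim=1))      # low-level fusion
        h_a2, h_b2 = self.enc_a2(h_a1), self.enc_b2(h_b1)
        z = self.fuse2(torch.cat([h_a2, h_b2, f1], dim=1))   # fused info passed upward
        return z, self.dec_a(z), self.dec_b(z)

model = MultiFusionAutoencoder()
x_a, x_b = torch.randn(8, 128), torch.randn(8, 64)
z, rec_a, rec_b = model(x_a, x_b)
loss = nn.functional.mse_loss(rec_a, x_a) + nn.functional.mse_loss(rec_b, x_b)
loss.backward()
print(z.shape)  # joint multimodal representation, e.g. torch.Size([8, 128])
```

In this toy setup the reconstruction losses on both decoding channels drive the joint code to retain information from each modality, while the stacked fusion points let cross-modality interactions enter at more than one feature level rather than only at the top of the network.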
