International Conference on Neural Information Processing
Learning Joint Multimodal Representation Based on Multi-fusion Deep Neural Networks


Abstract

Recently, learning joint representations of multimodal data has received increasing attention. Multimodal features are concept-level composite features that are more effective than single-modality features. Most existing methods mine interactions between modalities only once, at the top of their networks, to learn a multimodal representation. In this paper, we propose a multi-fusion deep learning framework that learns semantically richer multimodal features. The framework sets multiple fusion points at different levels of the feature space, then integrates the fused information and passes it step by step from low levels to higher levels. Moreover, we propose a multi-channel decoding network with an alternating fine-tuning strategy to fully mine modality-specific information and cross-modality correlations. We are also the first to introduce deep learning features into multimodal deep learning, alleviating the semantic and statistical differences between modalities to learn better features. Extensive experiments on real-world datasets demonstrate that our proposed method achieves superior performance compared with state-of-the-art methods.
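
To make the fusion scheme described in the abstract concrete, the sketch below gives one plausible reading in PyTorch: two modality-specific encoders, a fusion point at a low feature level whose output is carried forward into a second, higher-level fusion, and two decoding channels that reconstruct each modality from the joint code. The class name MultiFusionAutoencoder, the layer sizes, and the reconstruction loss are illustrative assumptions rather than the authors' implementation; the alternating fine-tuning strategy mentioned in the abstract is not reproduced here.

```python
# Hypothetical sketch of multi-level fusion for two modalities (not the authors' code).
import torch
import torch.nn as nn

class MultiFusionAutoencoder(nn.Module):
    def __init__(self, dim_a=128, dim_b=64, hidden=256, joint=128):
        super().__init__()
        # Modality-specific encoder layers (two levels per modality in this sketch).
        self.enc_a1 = nn.Sequential(nn.Linear(dim_a, hidden), nn.ReLU())
        self.enc_b1 = nn.Sequential(nn.Linear(dim_b, hidden), nn.ReLU())
        self.enc_a2 = nn.Sequential(nn.Linear(hidden, hidden), nn.ReLU())
        self.enc_b2 = nn.Sequential(nn.Linear(hidden, hidden), nn.ReLU())
        # Fusion point 1: combine low-level features of both modalities.
        self.fuse1 = nn.Sequential(nn.Linear(2 * hidden, hidden), nn.ReLU())
        # Fusion point 2: combine higher-level features with the earlier fused code.
        self.fuse2 = nn.Sequential(nn.Linear(3 * hidden, joint), nn.ReLU())
        # Multi-channel decoders: reconstruct each modality from the joint representation.
        self.dec_a = nn.Linear(joint, dim_a)
        self.dec_b = nn.Linear(joint, dim_b)

    def forward(self, x_a, x_b):
        h_a1, h_b1 = self.enc_a1(x_a), self.enc_b1(x_b)
        f1 = self.fuse1(torch.cat([h_a1, h_b1], dim=1))      # low-level fusion
        h_a2, h_b2 = self.enc_a2(h_a1), self.enc_b2(h_b1)
        z = self.fuse2(torch.cat([h_a2, h_b2, f1], dim=1))   # fused info passed upward
        return z, self.dec_a(z), self.dec_b(z)

model = MultiFusionAutoencoder()
x_a, x_b = torch.randn(8, 128), torch.randn(8, 64)
z, rec_a, rec_b = model(x_a, x_b)
loss = nn.functional.mse_loss(rec_a, x_a) + nn.functional.mse_loss(rec_b, x_b)
loss.backward()
print(z.shape)  # joint multimodal representation, e.g. torch.Size([8, 128])
```

In this toy setup the reconstruction losses on both decoding channels drive the joint code to retain information from each modality, while the stacked fusion points let cross-modality interactions enter at more than one feature level rather than only at the top of the network.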
