M3LA: A Novel Approach Based on Encoder-Decoder with Attention Framework for Multi-modal Multi-label Learning

Abstract

With the exponential growth of digital multimedia resources, most real-world data are represented in multi-modal form and usually carry multiple semantic labels. Multi-modal multi-label learning has become a very active research topic. However, previous methods have not considered either the relation between modalities and labels or the correlation among labels. In this paper, we consider the following three questions: (1) How can the correlation among labels be modeled? (2) Is there a correlation between modality and label? (3) Does the modal input order affect the prediction for an individual instance, and how can the most appropriate modal input sequence be found for each instance? To address these problems, we propose a novel method for Multi-modal Multi-label learning (MMML), based on an Encoder-Decoder with attention framework, named MMML-Attention (M3LA). M3LA takes all of these issues into account. Specifically, on the one hand, M3LA benefits from the Encoder-Decoder with attention structure and can model the relation between modalities and labels. On the other hand, we introduce a correlation matrix to learn the correlation among labels, which is obtained as a parameter through the training process. Notably, label prediction occurs at every step of the decoder, so the prediction is repeatedly corrected and the most accurate prediction is then obtained. To validate the effectiveness of the proposed method, we experimented on several widely used benchmark datasets and compared it with state-of-the-art approaches.
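
The mechanism described in the abstract (attention over modalities, a label prediction at every decoder step, and a learned label-correlation matrix) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract gives no equations, so the dot-product attention, the tanh state update, the way the matrix `C` is added to the logits, and all names (`decode_labels`, `W_state`, `W_ctx`, `W_out`, `C`) are assumptions made for illustration. Parameters are random rather than trained, and the modal-ordering question is not addressed here.

```python
import math
import random


def softmax(scores):
    # Numerically stable softmax over a list of floats.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(matrix, vec):
    # Matrix-vector product for a list-of-rows matrix.
    return [sum(w * v for w, v in zip(row, vec)) for row in matrix]


def rand_matrix(rows, cols, rng, scale=0.1):
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]


def decode_labels(modal_feats, params, num_steps):
    """Hypothetical decoder loop: attend over modalities, update the state,
    and re-predict all labels at every step."""
    d = len(modal_feats[0])
    num_labels = len(params["W_out"])
    state = [0.0] * d            # decoder state, starts at zero
    pred = [0.5] * num_labels    # uninformative initial label prediction
    attn_history = []
    for _ in range(num_steps):
        # Dot-product attention of the decoder state over modality features.
        scores = [sum(s * f for s, f in zip(state, feat)) for feat in modal_feats]
        attn = softmax(scores)
        context = [sum(a * feat[i] for a, feat in zip(attn, modal_feats))
                   for i in range(d)]
        # Simple recurrent state update (assumed form).
        rec = matvec(params["W_state"], state)
        inp = matvec(params["W_ctx"], context)
        state = [math.tanh(r + c) for r, c in zip(rec, inp)]
        # Label logits, corrected by the label-correlation matrix C applied
        # to the previous step's prediction (assumed form).
        logits = matvec(params["W_out"], state)
        corr = matvec(params["C"], pred)
        pred = [sigmoid(l + c) for l, c in zip(logits, corr)]
        attn_history.append(attn)
    return pred, attn_history


# Toy usage: 2 modalities, feature size 4, 3 labels, untrained random weights.
rng = random.Random(0)
d, num_labels, num_modalities = 4, 3, 2
params = {
    "W_state": rand_matrix(d, d, rng),
    "W_ctx": rand_matrix(d, d, rng),
    "W_out": rand_matrix(num_labels, d, rng),
    "C": rand_matrix(num_labels, num_labels, rng),  # label-correlation matrix
}
modal_feats = [[rng.uniform(-1, 1) for _ in range(d)] for _ in range(num_modalities)]
pred, attn = decode_labels(modal_feats, params, num_steps=num_modalities)
```

In this sketch `pred` holds one probability per label after the last decoder step, and `attn` records how strongly each step attended to each modality, which is one way the modality-label relation could be inspected. In a trained model `C` would be learned together with the other weights, as the abstract states for the correlation matrix.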

Bibliographic details

Source
International Joint Conference on Neural Networks | 2020 | pp. 1-8 | 8 pages
Authors
Yinlong Zhu; Yi Zhang
Original format
PDF
Keywords
Correlation; Training; Decoding; Machine learning; Predictive models; Semantics; Neural networks

Similar literature

1. Multivariate time series forecasting via attention-based encoder-decoder framework [J]. Neurocomputing. 2020, May 7 issue
2. AEDmts: An Attention-Based Encoder-Decoder Framework for Multi-Sensory Time Series Analytic [J]. Fan Jin, Wang Hongkun, Huang Yipan, Quality Control, Transactions. 2020
3. Multi-Task Learning Using Attention-Based Convolutional Encoder-Decoder for Dilated Cardiomyopathy CMR Segmentation and Classification [J]. Chao Luo, Canghong Shi, Xiaojie Li, Computers, Materials & Continua. 2020, issue 2
4. A Shared Multi-Attention Framework for Multi-Label Zero-Shot Learning [C]. Dat Huynh, Ehsan Elhamifar. IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2020
5. A new modelling framework for the transit assignment problem: A multi-agent learning-based approach [D]. Wahba, Mohammed Medhat Amin. 2004
6. The Allocation of Attention to Learning of Goal-Directed Actions: A Cognitive Neuroscience Framework Focusing on the Basal Ganglia [O]. E. A. Franz. 2012
7. AEDmts: An Attention-Based Encoder-Decoder Framework for Multi-Sensory Time Series Analytic [O]. Jin Fan, Hongkun Wang, Yipan Huang, 2020
