A Deep Multi-Modal CNN for Multi-Instance Multi-Label Image Classification

Lingyun Song; Jun Liu; Buyue Qian; Mingxuan Sun; Kuan Yang; Meng Sun; Samar Abbas

Abstract

Deep convolutional neural networks (CNNs) have shown superior performance on the task of single-label image classification. However, the applicability of CNNs to multi-label images remains an open problem, mainly for two reasons. First, each image is usually treated as an inseparable entity and represented as one instance, which mixes the visual information corresponding to different labels. Second, the correlations among labels are often overlooked. To address these limitations, we propose a deep multi-modal CNN for multi-instance multi-label image classification, called MMCNN-MIML. By combining CNNs with multi-instance multi-label (MIML) learning, our model represents each image as a bag of instances for image classification and inherits the merits of both CNNs and MIML. In particular, MMCNN-MIML has three main appealing properties: 1) it can automatically generate instance representations for MIML by exploiting the architecture of CNNs; 2) it takes advantage of the label correlations by grouping labels in its later layers; and 3) it incorporates the textual context of label groups to generate multi-modal instances, which are effective in discriminating visually similar objects belonging to different groups. Empirical studies on several benchmark multi-label image data sets show that MMCNN-MIML significantly outperforms the state-of-the-art baselines on multi-label image classification tasks.
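
The bag-of-instances idea in the abstract can be illustrated with a minimal sketch. This is not the authors' MMCNN-MIML architecture: it assumes that each spatial location of a CNN feature map is treated as one instance, that instances are scored per label with a single linear layer, and that instance scores are max-pooled into image-level label probabilities. Label grouping and the textual context of label groups are left out. All function names, weights, and feature values are hypothetical.

```python
import math


def sigmoid(x):
    """Map a real-valued score to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def feature_map_to_bag(feature_map):
    """Flatten an H x W grid of C-dimensional vectors into a bag of H*W instances."""
    return [vec for row in feature_map for vec in row]


def instance_scores(bag, weights, biases):
    """Score every instance against every label with one linear layer.

    weights holds one C-dimensional weight vector per label.
    Returns one list of per-label scores for each instance.
    """
    return [
        [sum(w * x for w, x in zip(w_label, inst)) + b
         for w_label, b in zip(weights, biases)]
        for inst in bag
    ]


def bag_label_probs(scores):
    """Max-pool the instance scores per label, then apply a sigmoid.

    Max pooling encodes the usual MIML assumption that an image carries
    a label if at least one of its instances supports that label.
    """
    n_labels = len(scores[0])
    return [sigmoid(max(s[k] for s in scores)) for k in range(n_labels)]


# Toy 2 x 2 feature map with 3 channels (hypothetical values).
feature_map = [
    [[1.0, 0.0, 0.0], [0.0, 2.0, 0.0]],
    [[0.0, 0.0, 0.5], [0.2, 0.1, 0.0]],
]
# Two labels, one weight vector each (hypothetical values).
weights = [[1.0, 0.0, 0.0], [0.0, 1.0, -1.0]]
biases = [0.0, 0.0]

bag = feature_map_to_bag(feature_map)
probs = bag_label_probs(instance_scores(bag, weights, biases))
print(len(bag), probs)
```

Because the pooling is per label, different instances can account for different labels of the same image, which is how a bag representation avoids mixing the visual evidence for separate labels into one vector.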

Bibliographic details

Source
IEEE Transactions on Image Processing | 2018, Issue 12 | pp. 6025-6038 | 14 pages
Authors
Lingyun Song; Jun Liu; Buyue Qian; Mingxuan Sun; Kuan Yang; Meng Sun; Samar Abbas;
Author affiliations

Department of Computer Science and Technology, National Engineering Lab of Big Data Analytics, Xi’an Jiaotong University, Xi’an, China;

Department of Computer Science and Technology, National Engineering Lab of Big Data Analytics, Xi’an Jiaotong University, Xi’an, China;

Department of Computer Science and Technology, National Engineering Lab of Big Data Analytics, Xi’an Jiaotong University, Xi’an, China;

Division of Computer Science and Engineering, School of Electrical Engineering and Computer Science, Louisiana State University, Baton Rouge, LA, USA;

Department of Computer Science and Technology, SPKLSTN Lab, Xi’an Jiaotong University, Xi’an, China;

School of Foreign Studies, Xi’an Jiaotong University, Xi’an, China;

Department of Computer Science and Technology, SPKLSTN Lab, Xi’an Jiaotong University, Xi’an, China;
Indexing information
Original format: PDF
Language: English
Chinese Library Classification
Keywords
Visualization; Task analysis; Correlation; Feature extraction; Computer science; Electronic mail; Sun;

Added to database: 2022-08-17 13:09:48

Similar literature

1. Complex Object Classification: A Multi-Modal Multi-Instance Multi-Label Deep Network with Optimal Transport [J]. Yang Yang, Yi-Feng Wu, De-Chuan Zhan. SIGKDD Explorations, 2018.
2. Semi-Supervised Multi-Modal Multi-Instance Multi-Label Deep Network with Optimal Transport [J]. Yang Yang, Fu Zhao-Yang, Zhan De-Chuan. IEEE Transactions on Knowledge and Data Engineering, 2021, Issue 2.
3. An Explainable Multi-Instance Multi-Label Classification Model for Full Slice Brain CT Images [J]. Changwei Song, Guanghui Fu, Jianqiang Li. IFAC PapersOnLine, 2020, Issue 5.
4. Transferring CNNs to Multi-Instance Multi-Label Classification on Small Datasets [C]. Mingzhi Dong, Kunkun Pang, Yang Wu. IEEE International Conference on Image Processing, 2017.
5. UAV Imagery for Tree Species Classification in Hawai'i: A Comparison of MLC, RF, and CNN Supervised Classification [D]. Ford, Derek James. 2020.
6. Multi-instance Multi-label Learning for Multi-class Classification of Whole Slide Breast Histopathology Images [O]. Caner Mercan, Selim Aksoy, Ezgi Mercan.
7. Multi-Instance Multi-Label Learning for Multi-Class Classification of Whole Slide Breast Histopathology Images [O]. Caner Mercan, Selim Aksoy, Ezgi Mercan. 2018.