Discriminative Multi-modal Feature Fusion for RGBD Indoor Scene Recognition

Abstract

RGBD scene recognition has attracted increasing attention due to the rapid development of depth sensors and their wide range of applications. Although much research has been conducted, most work has used hand-crafted features, which struggle to capture high-level semantic structures. Recently, features extracted from deep convolutional neural networks (CNNs) have produced state-of-the-art results on various computer vision tasks, which has inspired researchers to explore incorporating CNN-learned features for RGBD scene understanding. On the other hand, most existing work combines RGB and depth features without adequately exploiting the consistency and complementary information between them. Inspired by recent work on RGBD object recognition using multi-modal feature fusion, we introduce, for the first time, a novel discriminative multi-modal fusion framework for RGBD scene recognition that simultaneously considers the inter- and intra-modality correlation for all samples while regularizing the learned features to be discriminative and compact. The results from the multi-modal layer can be back-propagated to the lower CNN layers, so the parameters of the CNN layers and the multi-modal layers are updated iteratively until convergence. Experiments on the recently proposed large-scale SUN RGB-D dataset show that our method achieves state-of-the-art results without any image segmentation.
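The abstract does not give the objective itself, so the following is only a minimal sketch of what a fusion loss with an inter-modality term and an intra-modality discriminative term could look like. Everything in it is an assumption made for illustration: the function names (`sq_dist`, `pairwise_term`, `fusion_loss`), the contrastive-style pairwise penalty, and the `margin` and `alpha` parameters are hypothetical and are not taken from the paper.

```python
import math

def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def pairwise_term(feats, labels, margin):
    """Intra-modality term (assumed contrastive form): pull same-class
    samples together, push different-class samples at least `margin` apart."""
    total, pairs = 0.0, 0
    for i in range(len(feats)):
        for j in range(i + 1, len(feats)):
            d2 = sq_dist(feats[i], feats[j])
            if labels[i] == labels[j]:
                total += d2  # compactness within a class
            else:
                total += max(0.0, margin - math.sqrt(d2)) ** 2  # separation
            pairs += 1
    return total / pairs if pairs else 0.0

def fusion_loss(rgb, depth, labels, margin=1.0, alpha=1.0):
    """Hypothetical fusion objective: discriminative terms inside each
    modality plus a consistency term between the two modalities."""
    # Inter-modality term: RGB and depth features of the same sample agree.
    consistency = sum(sq_dist(r, d) for r, d in zip(rgb, depth)) / len(rgb)
    discriminative = (pairwise_term(rgb, labels, margin)
                      + pairwise_term(depth, labels, margin))
    return discriminative + alpha * consistency

# Toy usage: three samples, two classes, 2-D features per modality.
rgb = [[0.0, 0.0], [0.0, 0.0], [3.0, 4.0]]
depth = [[1.0, 0.0], [1.0, 0.0], [4.0, 4.0]]
labels = [0, 0, 1]
loss = fusion_loss(rgb, depth, labels)
```

In a trained system a loss of this kind would be differentiated with respect to the features so that its gradient can flow back into the CNN layers, which is the role the abstract assigns to the multi-modal layer; the sketch only evaluates the loss value.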

Bibliographic details

Source
IEEE Conference on Computer Vision and Pattern Recognition | 2016 | pp. 2969-2976 | 8 pages
Authors
Hongyuan Zhu; Jean-Baptiste Weibel; Shijian Lu
Original format: PDF
Keywords
Correlation; Image color analysis; Image recognition; Computer vision; Sensors; Image segmentation; Object recognition

Similar literature

1. Deep feature fusion through adaptive discriminative metric learning for scene recognition [J]. Wang Chen, Peng Guohua, De Baets Bernard. Information Fusion. 2020, No. 1
2. Multi-Modal Weights Sharing and Hierarchical Feature Fusion for RGBD Salient Object Detection [J]. Xiao Fen, Li Bin, Peng Yimu, Quality Control, Transactions. 2020
3. Self-weighted discriminative metric learning based on deep features for scene recognition [J]. Wang Chen, Peng Guohua, Lin Wei. Multimedia Tools and Applications. 2020, No. 3a4
4. Discriminative Multi-modal Feature Fusion for RGBD Indoor Scene Recognition [C]. Hongyuan Zhu, Jean-Baptiste Weibel, Shijian Lu. IEEE Conference on Computer Vision and Pattern Recognition. 2016
5. RGBD Pipeline for Indoor Scene Reconstruction and Understanding [D]. Halber, Maciej Stanislaw. 2019
6. Visual Scene-Aware Hybrid and Multi-Modal Feature Aggregation for Facial Expression Recognition [O]. Min Kyu Lee, Dae Ha Kim, Byung Cheol Song. 2020
7. A Discriminative Representation of Convolutional Features for Indoor Scene Recognition [O]. Khan, Salman H., Hayat, Munawar, Bennamoun, Mohammed. 2015
8. Feature Extraction and Object Recognition in Multi-Modal Forward Looking Imagery [R]. Greenwood, G., Blakely, S., Schartman, D. 2011
