Journal: Autonomous Robots

Viewpoint invariant semantic object and scene categorization with RGB-D sensors


Abstract

Understanding the semantics of objects and scenes using multi-modal RGB-D sensors serves many robotics applications. Key challenges for accurate RGB-D image recognition are the scarcity of training data, variations due to viewpoint changes, and the heterogeneous nature of the data. We address these problems and propose a generic deep learning framework based on a pre-trained convolutional neural network, as a feature extractor for both the colour and depth channels. We propose a rich multi-scale feature representation, referred to as convolutional hypercube pyramid (HP-CNN), that is able to encode discriminative information from the convolutional tensors at different levels of detail. We also present a technique to fuse the proposed HP-CNN with the activations of fully connected neurons based on an extreme learning machine classifier in a late fusion scheme, which leads to a highly discriminative and compact representation. To further improve performance, we devise HP-CNN-T, a view-invariant descriptor extracted from a multi-view 3D object pose (M3DOP) model. M3DOP is learned from over 140,000 RGB-D images that are synthetically generated by rendering CAD models from different viewpoints. Extensive evaluations on four RGB-D object and scene recognition datasets demonstrate that our HP-CNN and HP-CNN-T consistently outperform state-of-the-art methods for several recognition tasks by a significant margin.
