Object Recognition and Semantic Scene Labeling for RGB-D Data.

Abstract

The availability of RGB-D (Kinect-like) cameras has led to explosive growth in research on robot perception. RGB-D cameras provide high-resolution (640 x 480) synchronized video of both color (RGB) and depth (D) at 30 frames per second. This dissertation demonstrates the thesis that combining RGB and depth at high frame rates is helpful for various recognition tasks, including object recognition, object detection, and semantic scene labeling. We present the RGB-D Object Dataset, a large dataset of 250,000 RGB-D images of 300 objects in 51 categories, and 22 RGB-D videos of objects in indoor home and office environments. We introduce algorithms for object recognition in RGB-D images that perform category, instance, and pose recognition in a scalable manner. We also present HMP3D, an unsupervised feature learning approach for 3D point cloud data, and demonstrate that HMP3D can be used to learn hierarchies of features from different attributes, including color, gradient, shape, and surface normal orientation. Finally, we present a scene labeling approach for scenes constructed from RGB-D videos. The approach uses features learned from both individual RGB-D images and 3D point clouds constructed from entire video sequences. Through these applications, this thesis demonstrates the importance of designing new features and algorithms that specifically utilize the advantages of RGB-D cameras over traditional cameras and range sensors.

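Bibliographic record

  • Author

    Lai, Kevin Kar Wai

  • Author affiliation

    University of Washington

  • Degree-granting institution: University of Washington
  • Subject: Computer Science; Artificial Intelligence; Engineering Robotics
  • Degree: Ph.D.
  • Year: 2013
  • Pages: 172 p.
  • Total pages: 172
  • Original format: PDF
  • Language of text: English
  • Chinese Library Classification:
  • Keywords: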
