IEEE Transactions on Circuits and Systems for Video Technology

Single Image Depth Estimation With Normal Guided Scale Invariant Deep Convolutional Fields


Abstract

Estimating scene depth from a single image can be widely applied to understanding 3D environments, since images captured by consumer-level cameras are easy to obtain. Previous works exploit conditional random fields (CRFs) to estimate image depth, where neighboring pixels (superpixels) with similar appearances are constrained to share the same depth. However, depth may vary significantly on slanted surfaces, leading to severe estimation errors. To eliminate those errors, we propose a superpixel-based normal guided scale invariant deep convolutional field that encourages neighboring superpixels with similar appearance to lie on the same 3D plane of the scene. To this end, a depth-normal multitask CNN is introduced to produce superpixel-wise depth and surface normal predictions simultaneously. To correct the errors of the roughly estimated superpixel-wise depth, we develop a normal guided scale invariant CRF (NGSI-CRF). NGSI-CRF consists of a scale invariant unary potential, which measures both the relative depth between superpixels and the absolute depth of each superpixel, and a normal guided pairwise potential, which constrains spatial relationships between superpixels in accordance with the 3D layout of the scene. In other words, the normal guided pairwise potential smooths the depth prediction without deteriorating its 3D structure. The superpixel-wise depth maps estimated by NGSI-CRF are fed into a pixel-wise refinement module to produce a smooth, fine-grained depth prediction. Furthermore, we derive a closed-form solution for the maximum a posteriori (MAP) inference of NGSI-CRF, so the proposed network can be trained efficiently in an end-to-end manner. We conduct experiments on several datasets, including NYU-D2, KITTI, and Make3D. As the experimental results demonstrate, our method achieves superior performance in both indoor and outdoor scenes.
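The abstract states that MAP inference for NGSI-CRF admits a closed-form solution. As an illustrative sketch of why such a closed form can exist (not the paper's exact potentials): when the unary and pairwise potentials are quadratic in the depth variables, the CRF is Gaussian and its MAP estimate is the solution of a linear system. The names `d_unary` (the CNN's per-superpixel depth prediction) and `weights` (pairwise affinities, which in the paper would be guided by surface normals) are hypothetical placeholders for this simplified energy.

```python
import numpy as np

def gaussian_crf_map(d_unary, edges, weights, lam=1.0):
    """Closed-form MAP for the toy energy
        E(d) = ||d - d_unary||^2 + lam * sum_{(i,j)} w_ij * (d_i - d_j)^2.
    Setting the gradient of E to zero yields (I + lam * L) d = d_unary,
    where L is the weighted graph Laplacian of the superpixel graph.
    """
    n = d_unary.shape[0]
    L = np.zeros((n, n))  # weighted graph Laplacian
    for (i, j), w in zip(edges, weights):
        L[i, i] += w
        L[j, j] += w
        L[i, j] -= w
        L[j, i] -= w
    return np.linalg.solve(np.eye(n) + lam * L, d_unary)

# Toy example: three superpixels on a chain; smoothing pulls the noisy
# middle prediction toward its neighbors while the unary term anchors it.
d0 = np.array([1.0, 5.0, 1.0])
d_map = gaussian_crf_map(d0, edges=[(0, 1), (1, 2)], weights=[1.0, 1.0], lam=1.0)
# d_map -> [2.0, 3.0, 2.0]
```

Because the linear system is sparse and differentiable in its inputs, gradients can flow through the MAP solve, which is what makes end-to-end training of such CRF layers efficient.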
