Future Internet

Occlusion-Aware Unsupervised Learning of Monocular Depth, Optical Flow and Camera Pose with Geometric Constraints

Abstract

We present an occlusion-aware unsupervised neural network for jointly learning three low-level vision tasks from monocular videos: depth, optical flow, and camera motion. The system consists of three different predicting sub-networks that are coupled by combined loss terms during training, yet it can compute each task independently on test samples. Geometric constraints extracted from scene geometry, which have traditionally been used in bundle adjustment or pose-graph optimization, are formulated as various self-supervisory signals in our end-to-end learning approach. Unlike prior works, our image reconstruction loss also takes optical flow into account. Moreover, we impose novel 3D flow consistency constraints over the predictions of all three tasks. By explicitly modeling occlusion and utilizing both 2D and 3D geometric relationships, abundant geometric constraints are formed over the estimated outputs, enabling the system to capture both low-level representations and high-level cues for inferring thin scene structures. Empirical evaluation on the KITTI dataset demonstrates the effectiveness and improvements of our approach: (1) monocular depth estimation outperforms state-of-the-art unsupervised methods and is comparable to stereo-supervised ones; (2) optical flow prediction ranks top among prior works and even beats supervised and traditional methods, especially in non-occluded regions; (3) pose estimation outperforms established SLAM systems under comparable input settings by a reasonable margin.
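To make the geometric coupling concrete, the sketch below shows one simplified way such a constraint can be formed: predicted depth and camera pose are combined into a rigid flow field, which is then compared against the separately predicted optical flow in non-occluded regions only. This is a minimal PyTorch sketch written for this summary rather than the authors' released code; the function names, tensor shapes, and the externally supplied visibility mask are assumptions, and the paper's actual consistency term is defined over 3D flow rather than this 2D simplification.

```python
import torch


def rigid_flow_from_depth_pose(depth, pose, K):
    """Flow induced purely by camera motion (hypothetical helper,
    assuming a pinhole camera model).

    depth: (B, 1, H, W) predicted depth of the source frame
    pose:  (B, 3, 4)    predicted relative camera pose [R | t]
    K:     (B, 3, 3)    camera intrinsics
    returns a (B, 2, H, W) rigid flow field
    """
    B, _, H, W = depth.shape
    dtype, device = depth.dtype, depth.device

    # Pixel grid in homogeneous coordinates, shape (B, 3, H*W).
    ys, xs = torch.meshgrid(
        torch.arange(H, dtype=dtype, device=device),
        torch.arange(W, dtype=dtype, device=device),
        indexing="ij",
    )
    pix = torch.stack([xs, ys, torch.ones_like(xs)], dim=0)
    pix = pix.reshape(1, 3, -1).expand(B, -1, -1)

    # Back-project pixels to 3D, apply the predicted camera motion, re-project.
    cam = torch.linalg.inv(K) @ pix * depth.reshape(B, 1, -1)
    cam_h = torch.cat([cam, torch.ones(B, 1, H * W, dtype=dtype, device=device)], dim=1)
    proj = K @ (pose @ cam_h)
    uv = proj[:, :2] / proj[:, 2:3].clamp(min=1e-6)

    # Rigid flow is the displacement between re-projected and original pixels.
    return (uv - pix[:, :2]).reshape(B, 2, H, W)


def flow_consistency_loss(pred_flow, rigid_flow, visibility_mask):
    """Mean L1 disagreement between predicted and rigid flow,
    evaluated only where visibility_mask marks pixels as non-occluded."""
    diff = (pred_flow - rigid_flow).abs() * visibility_mask
    return diff.sum() / visibility_mask.sum().clamp(min=1.0)
```

In training, analogous terms can be attached to the outputs of all three sub-networks so that the combined loss couples depth, flow, and pose through the shared geometry, while each sub-network still runs on its own at test time.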
