IEEE Transactions on Multimedia

Bayesian DeNet: Monocular Depth Prediction and Frame-Wise Fusion With Synchronized Uncertainty



Abstract

Using deep convolutional neural networks (CNN) to predict the depth from a single image has received considerable attention in recent years due to its impressive performance. However, existing methods process each single image independently without leveraging the multiview information of video sequences in practical scenarios. Properly taking into account multiview information in video sequences beyond individual frames could offer considerable benefits in terms of depth prediction accuracy and robustness. In addition, a meaningful measure of prediction uncertainty is essential for decision making, which is not provided in existing methods. This paper presents a novel video-based depth prediction system based on a monocular camera, named Bayesian DeNet. Specifically, Bayesian DeNet consists of a 59-layer CNN that can concurrently output a depth map and an uncertainty map for each video frame. Each pixel in an uncertainty map indicates the error variance of the corresponding depth estimate. Depth estimates and uncertainties of previous frames are propagated to the current frame based on the tracked camera pose, yielding multiple depth/uncertainty hypotheses for the current frame which are then fused in a Bayesian inference framework for greater accuracy and robustness. Extensive experiments on three public datasets demonstrate that our Bayesian DeNet outperforms the state-of-the-art methods for monocular depth prediction. A demo video and code are publicly available.
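The abstract does not spell out the fusion step, but a per-pixel Bayesian fusion of depth hypotheses with known variances is commonly realized as a precision-weighted (inverse-variance) average under a Gaussian assumption. The sketch below illustrates that general idea only; the function name, array shapes, and parameters are illustrative assumptions, not taken from the paper or its released code.

```python
import numpy as np

def fuse_depth_hypotheses(depths, variances, eps=1e-6):
    """Per-pixel Gaussian fusion of depth hypotheses (illustrative sketch).

    depths, variances: arrays of shape (K, H, W), one slice per hypothesis,
    e.g. the current frame's CNN prediction plus estimates propagated from
    previous frames via the tracked camera pose.
    Returns the fused depth map and its posterior variance, both (H, W).
    """
    precisions = 1.0 / (variances + eps)          # inverse-variance weights
    fused_var = 1.0 / precisions.sum(axis=0)      # posterior variance per pixel
    fused_depth = fused_var * (precisions * depths).sum(axis=0)
    return fused_depth, fused_var

if __name__ == "__main__":
    # Toy usage: three 4x4 hypotheses with different uncertainty levels.
    rng = np.random.default_rng(0)
    depths = 2.0 + 0.1 * rng.standard_normal((3, 4, 4))
    variances = rng.uniform(0.01, 0.05, size=(3, 4, 4))
    d, v = fuse_depth_hypotheses(depths, variances)
    print(d.shape, v.shape)  # (4, 4) (4, 4)
```

Hypotheses with lower predicted variance dominate the weighted average, which is what lets the uncertainty map down-weight unreliable depth estimates during frame-wise fusion.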

