Unsupervised depth estimation from monocular videos with hybrid geometric-refined loss and contextual attention


Abstract

Most existing methods based on convolutional neural networks (CNNs) are supervised and require a large amount of ground-truth data for training. Recently, some unsupervised methods have used stereo image pairs as input by transforming depth estimation into a view synthesis problem, but they need a stereo camera as additional equipment for data acquisition. Therefore, we use more readily available monocular videos captured by a monocular camera as our input, and propose an unsupervised learning framework to predict scene depth maps from monocular video frames. First, we design a novel unsupervised hybrid geometric-refined loss, which explicitly exploits a more accurate geometric relationship between the input color image and the predicted depth map, and preserves depth boundaries and fine structures in depth maps. Then, we design a contextual attention module to capture non-local dependencies along the spatial and channel dimensions in a dual path, which improves the ability of feature representation and further preserves fine depth details. In addition, we utilize an adversarial loss, training a discriminator to distinguish synthesized from real color images, so as to produce realistic results. Experimental results demonstrate that the proposed framework achieves comparable or even better results than existing methods trained with monocular videos or stereo image pairs. (C) 2019 Elsevier B.V. All rights reserved.
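The abstract does not spell out the hybrid geometric-refined loss, but unsupervised frameworks of this kind are typically built on differentiable view synthesis: the predicted depth and the relative camera pose warp a neighboring frame into the target view, and the photometric error between the warped and real target frames supervises the network, often together with an edge-aware smoothness term that encourages the kind of boundary preservation the abstract mentions. The sketch below illustrates that standard objective in PyTorch; warp_frame, photometric_l1, and edge_aware_smoothness are illustrative names under these assumptions, not the authors' code.

```python
import torch
import torch.nn.functional as F

def warp_frame(src, depth, pose, K, K_inv):
    """Differentiably warp a source frame into the target view.

    src:   (B, 3, H, W) source RGB frame
    depth: (B, 1, H, W) predicted depth of the target frame
    pose:  (B, 3, 4) relative pose [R|t] from target to source camera
    K, K_inv: (B, 3, 3) camera intrinsics and their inverse
    """
    B, _, H, W = depth.shape
    ys, xs = torch.meshgrid(torch.arange(H, device=depth.device),
                            torch.arange(W, device=depth.device), indexing="ij")
    pix = torch.stack([xs, ys, torch.ones_like(xs)], dim=0).float()  # (3, H, W)
    pix = pix.view(1, 3, -1).expand(B, -1, -1)                       # (B, 3, N)
    cam = (K_inv @ pix) * depth.view(B, 1, -1)          # back-project to 3-D
    cam = torch.cat([cam, torch.ones(B, 1, H * W, device=depth.device)], dim=1)
    proj = K @ (pose @ cam)                             # project into source view
    z = proj[:, 2:3].clamp(min=1e-6)                    # guard against divide-by-zero
    x = proj[:, 0:1] / z / (W - 1) * 2 - 1              # normalize to [-1, 1]
    y = proj[:, 1:2] / z / (H - 1) * 2 - 1
    grid = torch.cat([x, y], dim=1).permute(0, 2, 1).reshape(B, H, W, 2)
    return F.grid_sample(src, grid, padding_mode="border", align_corners=True)

def photometric_l1(target, warped):
    # L1 reconstruction error between the real and synthesized target frames.
    return (target - warped).abs().mean()

def edge_aware_smoothness(depth, image):
    # Penalize depth gradients, down-weighted at image edges so that depth
    # discontinuities can align with object boundaries.
    dx_d = (depth[..., :, 1:] - depth[..., :, :-1]).abs()
    dy_d = (depth[..., 1:, :] - depth[..., :-1, :]).abs()
    dx_i = (image[..., :, 1:] - image[..., :, :-1]).abs().mean(1, keepdim=True)
    dy_i = (image[..., 1:, :] - image[..., :-1, :]).abs().mean(1, keepdim=True)
    return (dx_d * torch.exp(-dx_i)).mean() + (dy_d * torch.exp(-dy_i)).mean()
```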
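Similarly, the contextual attention module is described only as capturing non-local dependencies along the spatial and channel dimensions in a dual path. A minimal sketch consistent with that description, in the spirit of dual-attention designs such as DANet, computes position-wise and channel-wise self-attention in parallel and fuses them through learned residual weights; the class and parameter names below are hypothetical.

```python
import torch
import torch.nn as nn

class DualPathContextualAttention(nn.Module):
    """Sketch of dual-path non-local attention: one branch attends over
    spatial positions, the other over channels; both are fused residually."""

    def __init__(self, channels, reduction=8):
        super().__init__()
        inter = max(channels // reduction, 1)
        self.query = nn.Conv2d(channels, inter, 1)
        self.key = nn.Conv2d(channels, inter, 1)
        self.value = nn.Conv2d(channels, channels, 1)
        self.gamma_s = nn.Parameter(torch.zeros(1))  # learned residual weights
        self.gamma_c = nn.Parameter(torch.zeros(1))

    def forward(self, x):
        B, C, H, W = x.shape
        # Spatial path: every position attends to every other position.
        q = self.query(x).view(B, -1, H * W)                  # (B, C', N)
        k = self.key(x).view(B, -1, H * W)                    # (B, C', N)
        v = self.value(x).view(B, C, H * W)                   # (B, C, N)
        attn_s = torch.softmax(q.transpose(1, 2) @ k, dim=-1)  # (B, N, N)
        out_s = (v @ attn_s.transpose(1, 2)).view(B, C, H, W)
        # Channel path: every channel attends to every other channel.
        f = x.view(B, C, -1)                                  # (B, C, N)
        attn_c = torch.softmax(f @ f.transpose(1, 2), dim=-1)  # (B, C, C)
        out_c = (attn_c @ f).view(B, C, H, W)
        # Residual fusion of both paths.
        return x + self.gamma_s * out_s + self.gamma_c * out_c
```

Initializing gamma_s and gamma_c to zero lets the module start as an identity mapping and gradually learn how much non-local context to mix into the features.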

Bibliographic record

  • Source
    Neurocomputing | 2020, Issue 28 | pp. 250-261 | 12 pages
  • Author affiliations

    Dalian University of Technology, DUT-RU International School of Information Science & Engineering, Key Laboratory for Ubiquitous Network and Service Software of Liaoning Province, Dalian, People's Republic of China | Dalian University of Technology, School of Mathematical Sciences, Dalian, People's Republic of China;

    Dalian University of Technology, DUT-RU International School of Information Science & Engineering, Key Laboratory for Ubiquitous Network and Service Software of Liaoning Province, Dalian, People's Republic of China;

  • Indexed in: Science Citation Index (SCI); Engineering Index (EI)
  • Format: PDF
  • Language: English
  • Keywords

    Unsupervised; Monocular video; Attention; Hybrid geometric-refined loss;

