IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
Cascade Cost Volume for High-Resolution Multi-View Stereo and Stereo Matching

Abstract

Deep multi-view stereo (MVS) and stereo matching approaches generally construct 3D cost volumes to regularize and regress the output depth or disparity. These methods are limited when high-resolution outputs are needed, since the memory and time costs grow cubically as the volume resolution increases. In this paper, we propose a memory- and time-efficient cost volume formulation that is complementary to existing multi-view stereo and stereo matching approaches based on 3D cost volumes. First, the proposed cost volume is built upon a standard feature pyramid encoding geometry and context at gradually finer scales. Then, we narrow the depth (or disparity) range of each stage using the depth (or disparity) map from the previous stage. With gradually higher cost volume resolution and adaptive adjustment of depth (or disparity) intervals, the output is recovered in a coarse-to-fine manner. We apply the cascade cost volume to the representative MVSNet and obtain a 35.6% improvement on the DTU benchmark (1st place), with 50.6% and 59.3% reductions in GPU memory and run-time. It is also the state-of-the-art learning-based method on the Tanks and Temples benchmark. Statistics of accuracy, run-time, and GPU memory on other representative stereo CNNs also validate the effectiveness of our proposed method. Our source code is available at https://github.com/alibaba/cascade-stereo.
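The core idea of the abstract — narrowing each stage's depth (or disparity) search range around the previous, coarser stage's estimate — can be sketched as follows. This is an illustrative NumPy sketch, not the authors' implementation; the function name, the example depth bounds, and the plane counts are assumptions chosen for demonstration.

```python
import numpy as np

def cascade_depth_hypotheses(prev_depth, num_hypotheses, interval):
    """Build per-pixel depth hypotheses for the next cascade stage.

    prev_depth: (H, W) depth map upsampled from the previous, coarser stage.
    num_hypotheses: number of depth planes sampled at this stage.
    interval: spacing between adjacent planes (shrinks at finer stages).

    Returns a (num_hypotheses, H, W) array of candidate depths centred on
    the previous stage's estimate, i.e. a narrowed per-pixel search range.
    """
    offsets = (np.arange(num_hypotheses) - num_hypotheses / 2) * interval
    return prev_depth[None, :, :] + offsets[:, None, None]

# Coarse stage: uniform sampling over the full scene depth range
# (illustrative bounds and plane count).
full_range = np.linspace(425.0, 935.0, 48)

# Pretend the coarse stage estimated ~600 depth units everywhere.
coarse_depth = np.full((4, 4), 600.0)

# Fine stage: fewer, tightly spaced planes clustered around that estimate,
# so the cost volume covers a much smaller depth range per pixel.
fine_hyps = cascade_depth_hypotheses(coarse_depth, 16, 2.0)
print(fine_hyps.shape)                 # (16, 4, 4)
print(np.ptp(fine_hyps[:, 0, 0]))      # searched range shrinks to 30.0 units
```

Because the finer stages sample only a narrow band around the previous estimate, the number of depth planes (and hence cost volume memory) stays small even as the spatial resolution grows, which is the source of the reported memory and run-time savings.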
