Journal: Neurocomputing

Learning efficient multi-task stereo matching network with richer feature information


Abstract

Accurate depth estimation is a research hotspot in stereo vision, and the accuracy of the stereo matching algorithm directly determines the quality of the depth map. Recent research has recast stereo matching as a supervised learning task. However, previous methods may produce mismatches in textureless regions, at boundaries, and in tiny details. In this paper, we propose a multi-task attention stereo network (MASNet) that integrates feature information from a stereo image pair for disparity estimation. First, we propose a segmentation attention head (SAH) module, which adds semantic segmentation cues for disparity estimation, uses a global receptive field to guide feature extraction toward refined features, and alleviates the negative impact of increasing network depth. Second, we construct a multiple cost volume (MCV) to make full use of the aggregation ability of 3D convolution and provide a better similarity measure for disparity estimation. Third, we embed a top-k pooling layer into the 3D CNN module to obtain a reduced aggregation feature. This feature is fed into the proposed shallow merging network and fused with the intermediate feature to obtain richer low-level features and compensate for the limited comprehensiveness of the network neck feature. Experimental results on the Scene Flow, KITTI 2012, and KITTI 2015 datasets show that the proposed network significantly outperforms state-of-the-art stereo matching methods. (c) 2020 Elsevier B.V. All rights reserved.
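
The abstract does not say how the multiple cost volume is assembled or where exactly top-k pooling is applied, so the following is only a minimal NumPy sketch of the general ideas, not the MASNet implementation. It assumes two common cost volume constructions (correlation and concatenation) and applies top-k pooling along the disparity axis. All function names and parameters here are hypothetical.

```python
import numpy as np


def correlation_cost_volume(left, right, max_disp):
    """Correlation cost volume from (C, H, W) feature maps.

    Returns an array of shape (max_disp, H, W). Entry [d, y, x] is the mean
    channel-wise product of the left feature at x and the right feature at x - d.
    """
    _, h, w = left.shape
    volume = np.zeros((max_disp, h, w), dtype=left.dtype)
    for d in range(max_disp):
        if d == 0:
            volume[d] = (left * right).mean(axis=0)
        else:
            # Columns x < d have no counterpart in the right view and stay zero.
            volume[d, :, d:] = (left[:, :, d:] * right[:, :, :-d]).mean(axis=0)
    return volume


def concat_cost_volume(left, right, max_disp):
    """Concatenation cost volume of shape (2C, max_disp, H, W).

    A 3D CNN would typically aggregate a volume like this one.
    """
    c, h, w = left.shape
    volume = np.zeros((2 * c, max_disp, h, w), dtype=left.dtype)
    for d in range(max_disp):
        if d == 0:
            volume[:c, d] = left
            volume[c:, d] = right
        else:
            volume[:c, d, :, d:] = left[:, :, d:]
            volume[c:, d, :, d:] = right[:, :, :-d]
    return volume


def top_k_pool(volume, k, axis=0):
    """Keep the k largest values along `axis`, in descending order."""
    n = volume.shape[axis]
    ascending = np.sort(volume, axis=axis)
    # Pick the last k entries of the ascending sort, largest first.
    indices = np.arange(n - 1, n - k - 1, -1)
    return np.take(ascending, indices, axis=axis)
```

In this sketch, top-k pooling shrinks the pooled axis from `max_disp` to `k` entries per pixel, which is one way to read the "reduced aggregation feature" mentioned in the abstract. The paper itself should be consulted for the actual layer placement and the choice of k.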
