Journal: Pattern Recognition: The Journal of the Pattern Recognition Society

Video semantic segmentation via feature propagation with holistic attention

Abstract

Since the frames of a video are inherently contiguous, information redundancy is ubiquitous. Unlike previous works that densely process every frame of a video, this paper presents a novel method that focuses on efficient feature propagation across frames to tackle the challenging video semantic segmentation task. First, we propose a Light, Efficient and Real-time network (denoted as LERNet) as a strong backbone network for per-frame processing. We then mine rich features within a key frame and propagate the cross-frame consistency information by computing a temporal holistic attention with the following non-key frame. Each element of the attention matrix represents the global correlation between pixels of a non-key frame and pixels of the previous key frame. Concretely, we propose a new attention module that captures spatial consistency on low-level features along the temporal dimension. We then use the attention weights as spatial transition guidance to generate the high-level features of the current non-key frame directly from the weighted corresponding key frame. Finally, we efficiently fuse the hierarchical features of the non-key frame and obtain the final segmentation result. Extensive experiments on two popular datasets, Cityscapes and CamVid, demonstrate that the proposed approach achieves a remarkable balance between inference speed and accuracy. (C) 2020 Elsevier Ltd. All rights reserved.
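The propagation step described in the abstract can be illustrated with a minimal NumPy sketch. The abstract does not give the exact formulation, so everything below is an assumption made for illustration: the function name `propagate_features` is hypothetical, similarity is taken as a scaled dot product between low-level features, the attention is normalized with a softmax over key-frame pixels, and the low-level and high-level feature maps are assumed to share one spatial resolution. It is not the LERNet implementation.

```python
import numpy as np


def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def propagate_features(low_key, low_cur, high_key):
    """Hypothetical sketch of attention-guided feature propagation.

    low_key:  (C_low, H, W)  low-level features of the key frame
    low_cur:  (C_low, H, W)  low-level features of the current non-key frame
    high_key: (C_high, H, W) high-level features of the key frame
    Returns the propagated high-level features (C_high, H, W) and the
    attention matrix (H*W, H*W).
    """
    c_low, h, w = low_key.shape
    n = h * w
    k = low_key.reshape(c_low, n)                # key-frame pixels as columns
    q = low_cur.reshape(c_low, n)                # non-key-frame pixels as columns

    # Assumption: scaled dot-product similarity between every pixel pair.
    scores = q.T @ k / np.sqrt(c_low)            # (N, N)

    # attention[i, j]: weight of key-frame pixel j for non-key-frame pixel i.
    attention = softmax(scores, axis=1)

    # Each non-key pixel receives a weighted sum of key-frame high-level features.
    v = high_key.reshape(high_key.shape[0], n)   # (C_high, N)
    high_cur = v @ attention.T                   # (C_high, N)
    return high_cur.reshape(high_key.shape), attention


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    low_key = rng.standard_normal((4, 3, 5))
    low_cur = rng.standard_normal((4, 3, 5))
    high_key = rng.standard_normal((8, 3, 5))
    high_cur, attention = propagate_features(low_key, low_cur, high_key)
    print(high_cur.shape, attention.shape)
```

Because every row of the attention matrix sums to one, each propagated feature vector is a convex combination of key-frame feature vectors. The full matrix holds (H*W)^2 entries, so a sketch like this is only practical on downsampled feature maps.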
