Conference: European Conference on Computer Vision

Can Ground Truth Label Propagation from Video Help Semantic Segmentation?

Abstract

Training convolutional neural networks (CNNs) for state-of-the-art semantic segmentation requires dense pixelwise ground truth (GT) labeling, which is expensive and involves extensive human effort. In this work, we study the possibility of using auxiliary ground truth, so-called pseudo ground truth (PGT), to improve performance. The PGT is obtained by propagating the labels of a GT frame to its subsequent frames in the video using a simple CRF-based cue integration framework. Our main contribution is to demonstrate that noisy PGT, used along with GT, can improve the performance of a CNN. We perform a systematic analysis to find the right kind of PGT to add to the GT when training a CNN. In this regard, we explore three aspects of PGT that influence the learning of a CNN: (i) the PGT labeling has to be of good quality; (ii) the PGT images have to differ from the GT images; (iii) the PGT has to be trusted differently than GT. We conclude that PGT which is diverse with respect to the GT images and has good labeling quality can indeed help improve the performance of a CNN. Also, when the PGT is several times larger than the GT, down-weighting the trust in PGT helps improve accuracy. Finally, we show that using PGT along with GT increases the performance of a Fully Convolutional Network (FCN) on CamVid data by 2.7% in IoU accuracy. We believe such an approach can be used to train CNNs for semantic video segmentation, where sequentially labeled image frames are needed. To this end, we provide recommendations for using PGT strategically for semantic segmentation and hence bypassing the need for extensive human labeling effort.
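The abstract does not say how the reduced trust in PGT is implemented. One plausible reading is a per-sample weight in the training loss, and the sketch below illustrates only that idea. The function names, the `pgt_weight` parameter, and the weighted-average normalization are assumptions made for illustration, not the authors' method.

```python
import math

def pixel_cross_entropy(probs, label):
    """Negative log-likelihood of the labeled class for one pixel."""
    return -math.log(max(probs[label], 1e-12))

def trust_weighted_loss(samples, pgt_weight=0.5):
    """Weighted mean per-pixel cross-entropy over a mixed GT/PGT batch.

    samples: list of (pixel_probs, pixel_labels, is_pgt) tuples, where
      pixel_probs is a list of per-class probability lists and
      pixel_labels is a list of integer class ids.
    pgt_weight: hypothetical trust factor for PGT pixels; GT pixels use 1.0.
    """
    total, norm = 0.0, 0.0
    for pixel_probs, pixel_labels, is_pgt in samples:
        w = pgt_weight if is_pgt else 1.0
        for probs, label in zip(pixel_probs, pixel_labels):
            total += w * pixel_cross_entropy(probs, label)
            norm += w
    # Guard against an empty batch or all-zero weights.
    return total / norm if norm > 0 else 0.0

# Toy two-class example: one GT frame and one propagated (PGT) frame,
# each with two pixels.
gt_frame = ([[0.9, 0.1], [0.2, 0.8]], [0, 1], False)
pgt_frame = ([[0.6, 0.4], [0.5, 0.5]], [0, 0], True)
loss = trust_weighted_loss([gt_frame, pgt_frame], pgt_weight=0.5)
```

Setting `pgt_weight=1.0` treats PGT exactly like GT, while `pgt_weight=0.0` ignores it. Values in between shrink the contribution of noisy propagated labels when PGT frames heavily outnumber GT frames.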
