Can Ground Truth Label Propagation from Video Help Semantic Segmentation?



Abstract

For state-of-the-art semantic segmentation, training convolutional neural networks (CNNs) requires dense pixelwise ground truth (GT) labeling, which is expensive and involves extensive human effort. In this work, we study the possibility of using auxiliary ground truth, so-called pseudo ground truth (PGT), to improve performance. The PGT is obtained by propagating the labels of a GT frame to its subsequent frames in the video using a simple CRF-based cue integration framework. Our main contribution is to demonstrate the use of noisy PGT along with GT to improve the performance of a CNN. We perform a systematic analysis to find the right kind of PGT to add to the GT for training a CNN. In this regard, we explore three aspects of PGT which influence the learning of a CNN: (i) the PGT labeling has to be of good quality; (ii) the PGT images have to be different from the GT images; (iii) the PGT has to be trusted differently than GT. We conclude that PGT which is diverse from GT images and has good labeling quality can indeed help improve the performance of a CNN. Also, when PGT is several times larger than GT, down-weighting the trust in PGT helps improve accuracy. Finally, we show that using PGT along with GT increases the performance of a Fully Convolutional Network (FCN) on CamVid data by 2.7% in IoU accuracy. We believe such an approach can be used to train CNNs for semantic video segmentation, where sequentially labeled image frames are needed. To this end, we provide recommendations for using PGT strategically for semantic segmentation and hence bypass the need for extensive human effort in labeling.
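
The abstract states that PGT should be trusted differently than GT but does not say how the trust is applied. One common way to express this, shown here only as a minimal sketch and not as the authors' method, is to scale each image's cross-entropy loss by a trust factor. The function name `weighted_pixel_cross_entropy` and the parameter `pgt_weight` are hypothetical, chosen for illustration.

```python
import numpy as np

def weighted_pixel_cross_entropy(probs, labels, is_pgt, pgt_weight=0.5):
    """Weighted mean cross-entropy where PGT images count less than GT images.

    probs:      (N, H, W, C) predicted class probabilities
    labels:     (N, H, W) integer class labels
    is_pgt:     (N,) boolean, True for images labeled by propagation
    pgt_weight: assumed trust factor in [0, 1] applied to PGT images
    """
    # Probability the model assigns to the labeled class at every pixel.
    picked = np.take_along_axis(probs, labels[..., None], axis=-1)[..., 0]
    # Mean negative log-likelihood per image; clip to avoid log(0).
    per_image = -np.log(np.clip(picked, 1e-12, 1.0)).mean(axis=(1, 2))
    # GT images get weight 1, PGT images get the reduced trust factor.
    weights = np.where(is_pgt, pgt_weight, 1.0)
    return float((weights * per_image).sum() / weights.sum())
```

With `pgt_weight=1.0` this reduces to the ordinary mean loss over all images; smaller values reduce the influence of noisy propagated labels when PGT images outnumber GT images.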


Bibliographic details

Source
European Conference on Computer Vision | 2016 | xxiii 919 p. | 17 pages
Authors
Siva Karthik Mustikovela; Michael Ying Yang; Carsten Rother
Original format: PDF
CLC classification: Information processing

Similar literature

1. Semantic Object Classes In Video: A High-definition Ground Truth Database [J]. Gabriel J. Brostow, Julien Fauqueur, Roberto Cipolla. Pattern Recognition Letters. 2009, No. 2
2. Objective Performance Evaluation of Video Segmentation Algorithms with Ground-Truth [J]. YANG Gao-bo, ZHANG Zhao-yang. Journal of Shanghai University. 2004, No. 1
3. Objective Performance Evaluation of Video Segmentation Algorithms with Ground-Truth [J]. YANG Gao-bo, ZHANG Zhao-yang. Journal of Shanghai University (English Edition). 2004, No. 001
4. Can Ground Truth Label Propagation from Video Help Semantic Segmentation? [C]. Siva Karthik Mustikovela, Michael Ying Yang, Carsten Rother. European Conference on Computer Vision. 2016
5. Using objective ground-truth labels created by multiple annotators for improved video classification: A comparative study. [D]. Srivastava, Gaurav. 2012
6. Auto-segmentations by convolutional neural network in cervical and anorectal cancer with clinical structure sets as the ground truth [O]. Hanna Sartor, David Minarik, Olof Enqvist, 2020
7. Semantics Through Time: Semi-supervised Segmentation of Aerial Videos with Iterative Label Propagation [O]. Alina Marcu, Vlad Licaret, Dragos Costea, 2021
