
Label propagation in RGB-D video



Abstract

We propose a new method for propagating semantic labels in RGB-D video of indoor scenes given a set of ground-truth keyframes. Manually labeling all pixels in every frame of a video sequence is labor-intensive and costly, yet required for training and testing semantic segmentation methods. The availability of video enables propagation of labels between frames, yielding a large amount of annotated pixels. While previous methods commonly used optical-flow motion cues for label propagation, we present a novel approach that uses the camera poses and 3D point clouds to propagate the labels into superpixels computed on the unannotated frames of the sequence. The propagation task is formulated as an energy minimization problem in a Conditional Random Field (CRF). We performed experiments on 8 video sequences from the SUN3D dataset [1] and showed superior performance to an optical-flow-based label propagation approach. Furthermore, we demonstrated that the propagated labels can be used to learn better models with data-hungry deep convolutional neural network (DCNN) based approaches for the task of semantic segmentation. Performance increases when the ground-truth keyframes are combined with the propagated labels during training.
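The geometric propagation step the abstract describes (back-projecting keyframe pixels with depth, transferring them through the relative camera pose, and voting inside the target frame's superpixels) can be sketched roughly as below. This is a minimal illustration, not the authors' implementation: the pinhole intrinsics `K`, the camera-to-world pose convention, the precomputed superpixel map `tgt_superpixels`, and all function names are assumptions. The per-superpixel vote counts it returns would serve as the unary term of the CRF energy; the pairwise terms and the minimization itself are not shown.

```python
import numpy as np

def backproject(depth, K):
    """Back-project a depth map (H, W) into camera-space 3D points (H*W, 3)."""
    H, W = depth.shape
    u, v = np.meshgrid(np.arange(W), np.arange(H))  # pixel coordinates
    z = depth.ravel()
    x = (u.ravel() - K[0, 2]) * z / K[0, 0]
    y = (v.ravel() - K[1, 2]) * z / K[1, 1]
    return np.stack([x, y, z], axis=1)

def propagate_labels(key_depth, key_labels, pose_key, pose_tgt, K,
                     tgt_superpixels, n_classes):
    """Transfer keyframe labels into the target frame's superpixels.

    pose_key / pose_tgt: 4x4 camera-to-world matrices (assumed convention).
    Returns an (n_superpixels, n_classes) vote matrix usable as a CRF unary.
    """
    pts = backproject(key_depth, K)  # keyframe camera space
    pts_h = np.concatenate([pts, np.ones((len(pts), 1))], axis=1)
    # keyframe camera -> world -> target camera
    rel = np.linalg.inv(pose_tgt) @ pose_key
    pts_tgt = (rel @ pts_h.T).T[:, :3]

    # Project into the target image plane; keep points in front of the camera.
    valid = pts_tgt[:, 2] > 0
    u = np.round(K[0, 0] * pts_tgt[:, 0] / pts_tgt[:, 2] + K[0, 2]).astype(int)
    v = np.round(K[1, 1] * pts_tgt[:, 1] / pts_tgt[:, 2] + K[1, 2]).astype(int)

    H, W = tgt_superpixels.shape
    inside = valid & (u >= 0) & (u < W) & (v >= 0) & (v < H)
    labels = key_labels.ravel()[inside]
    sp_ids = tgt_superpixels[v[inside], u[inside]]

    # Accumulate label votes per superpixel; normalizing these counts gives
    # a per-superpixel class distribution for the CRF's unary potential.
    n_sp = int(tgt_superpixels.max()) + 1
    votes = np.zeros((n_sp, n_classes))
    np.add.at(votes, (sp_ids, labels), 1)
    return votes
```

A practical variant would also discard transferred points whose depth disagrees with the target frame's depth map (occlusion check), which the simple projection above does not handle.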
