Journal: Image and Vision Computing

Fully automatic person segmentation in unconstrained video using spatio-temporal conditional random fields



Abstract

The segmentation of objects, and of people in particular, is an important problem in computer vision. In this paper, we focus on automatically segmenting a person from challenging video sequences in which we place no constraints on camera viewpoint, camera motion, or the movements of the person in the scene. Our approach uses the most confident predictions from a pose detector as anchor, or keyframe, stick-figure predictions that help guide the segmentation of other, more challenging frames in the video. Since even state-of-the-art pose detectors are unreliable on many frames, especially given that we are interested in segmentation with no camera or motion constraints, only the pose (stick-figure) predictions for the frames with the highest confidence in a localized temporal region anchor further processing. The stick-figure predictions within confident keyframes are used to extract color, position, and optical-flow features. Multiple conditional random fields (CRFs) process blocks of video in batches: a two-dimensional CRF produces detailed keyframe segmentations, while 3D CRFs propagate those segmentations to the entire sequence of frames belonging to each batch. Location information derived from the pose is also used to refine the results. Importantly, no hand-labeled training data is required by our method. We discuss the use of a continuity method that reuses learned parameters between batches of frames, and we show how pose predictions can also be improved by our model. We provide an extensive evaluation of our approach, comparing it with a variety of alternative GrabCut-based methods and a prior state-of-the-art method. We also release our evaluation data to the community to facilitate further experiments. We find that our approach yields state-of-the-art qualitative and quantitative performance compared to prior work and more heuristic alternative approaches. (C) 2016 Elsevier B.V. All rights reserved.
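The anchoring idea in the abstract, keeping only the most confident pose prediction in each localized temporal region as a keyframe, can be sketched as follows. This is a minimal illustration, not the paper's implementation: the function name, the non-overlapping fixed-size window, and the window length are all assumptions introduced here.

```python
def select_keyframes(confidences, window=5):
    """Pick one anchor (keyframe) per localized temporal region.

    confidences -- one pose-detector confidence score per frame
    window      -- frames per temporal region (illustrative choice)
    Returns the indices of the selected keyframes.
    """
    keyframes = []
    for start in range(0, len(confidences), window):
        block = confidences[start:start + window]
        # Anchor on the most confident pose prediction in this region;
        # only these frames drive the detailed 2D CRF segmentation.
        best = max(range(len(block)), key=lambda i: block[i])
        keyframes.append(start + best)
    return keyframes
```

In the pipeline the abstract describes, the stick figures at these keyframes would then supply color, position, and optical-flow features, with 3D CRFs propagating the keyframe segmentations to the remaining frames of each batch.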
