Venue: International Conference on Computer Vision

Video Object Segmentation Using Space-Time Memory Networks

Abstract

We propose a novel solution for semi-supervised video object segmentation. By the nature of the problem, the available cues (e.g., video frames with object masks) become richer with the intermediate predictions. However, existing methods are unable to fully exploit this rich source of information. We resolve the issue by leveraging memory networks and learning to read relevant information from all available sources. In our framework, the past frames with object masks form an external memory, and the current frame, as the query, is segmented using the mask information in the memory. Specifically, the query and the memory are densely matched in the feature space, covering all the space-time pixel locations in a feed-forward fashion. In contrast to previous approaches, the abundant use of the guidance information allows us to better handle challenges such as appearance changes and occlusions. We validate our method on the latest benchmark sets and achieve state-of-the-art performance (overall score of 79.4 on the YouTube-VOS val set, J of 88.7 and 79.2 on the DAVIS 2016 and 2017 val sets, respectively) while having a fast runtime (0.16 seconds per frame on the DAVIS 2016 val set).
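The dense matching between query and memory described in the abstract can be illustrated with a small sketch. It assumes dot-product similarity between per-pixel key features, followed by a softmax over all space-time memory locations; the abstract does not state the similarity function or the normalization, so both are assumptions here. The function name `memory_read`, the key/value split, and all tensor shapes are hypothetical and chosen only for illustration.

```python
import numpy as np

def memory_read(query_key, memory_keys, memory_values):
    """Soft read from a space-time memory (illustrative sketch only).

    query_key:     (HW, Ck)    one key vector per pixel of the current frame
    memory_keys:   (T*HW, Ck)  key vectors for every pixel of T past frames
    memory_values: (T*HW, Cv)  value vectors (assumed to carry mask information)
    Returns the read values (HW, Cv) and the attention weights (HW, T*HW).
    """
    # Similarity of every query pixel to every space-time memory pixel
    # (dot product is an assumption, not taken from the abstract).
    scores = query_key @ memory_keys.T
    # Numerically stable softmax over the memory axis.
    scores = scores - scores.max(axis=1, keepdims=True)
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=1, keepdims=True)
    # Each query pixel receives a weighted sum of the memory values.
    return weights @ memory_values, weights

# Toy example: 3 memory frames of 4x4 pixels, 8-dim keys, 16-dim values.
rng = np.random.default_rng(0)
T, H, W, Ck, Cv = 3, 4, 4, 8, 16
q_key = rng.standard_normal((H * W, Ck))
m_keys = rng.standard_normal((T * H * W, Ck))
m_vals = rng.standard_normal((T * H * W, Cv))

read, attn = memory_read(q_key, m_keys, m_vals)
print(read.shape, attn.shape)
```

Because the weights are computed for every pair of query and memory pixels in a single matrix product, the read covers all space-time locations in one feed-forward pass, which is the property the abstract emphasizes. In a full model the read values would presumably be passed to a decoder that predicts the mask; that part is not sketched here.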
