International Conference on Neural Information Processing

Motion-Based Occlusion-Aware Pixel Graph Network for Video Object Segmentation



Abstract

This paper proposes a dual-channel Graph Convolutional Network (GCN) for the Video Object Segmentation (VOS) task. The main contribution lies in formulating two pixel graphs based on raw RGB and optical-flow features. Spatial and temporal features are learned independently, making the network robust to various challenging scenarios in real-world videos. Additionally, a motion-orientation-based aggregation scheme efficiently captures long-range dependencies among objects. This not only addresses the complex problem of modelling velocity differences among multiple objects moving in different directions, but also adapts to changes in object appearance caused by pose and scale deformations. An occlusion-aware attention mechanism further facilitates accurate segmentation when multiple objects exhibit temporal discontinuities in appearance due to occlusion. Performance analysis on the DAVIS-2016 and DAVIS-2017 datasets shows that the proposed method outperforms existing state-of-the-art techniques in foreground object segmentation. Control experiments on the CamVid dataset demonstrate the model's ability to generalise to scene segmentation.
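The dual-channel pixel-graph idea can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the 4-neighbourhood grid graph, the single standard GCN layer per branch (symmetrically normalised propagation, ReLU), and the concatenation-based fusion are all simplifying assumptions chosen to show how independent spatial (RGB) and temporal (optical-flow) branches over the same pixel graph can be combined.

```python
import numpy as np

def grid_adjacency(h, w):
    """Normalised adjacency D^-1/2 (A + I) D^-1/2 for an h*w pixel grid
    with 4-neighbour connectivity and self-loops (standard GCN propagation)."""
    n = h * w
    A = np.eye(n)
    for r in range(h):
        for c in range(w):
            i = r * w + c
            if c + 1 < w:                      # right neighbour
                A[i, i + 1] = A[i + 1, i] = 1.0
            if r + 1 < h:                      # bottom neighbour
                A[i, i + w] = A[i + w, i] = 1.0
    d_inv_sqrt = 1.0 / np.sqrt(A.sum(axis=1))
    return (A * d_inv_sqrt).T * d_inv_sqrt     # D^-1/2 A D^-1/2

def gcn_layer(A_hat, X, W):
    """One graph-convolution layer: ReLU(A_hat @ X @ W)."""
    return np.maximum(A_hat @ X @ W, 0.0)

# Toy 4x4 frame: 3 RGB channels and 2 optical-flow channels per pixel node.
rng = np.random.default_rng(0)
h = w = 4
X_rgb = rng.standard_normal((h * w, 3))
X_flow = rng.standard_normal((h * w, 2))
A_hat = grid_adjacency(h, w)

# Spatial (RGB) and temporal (flow) branches learned independently ...
H_rgb = gcn_layer(A_hat, X_rgb, rng.standard_normal((3, 8)))
H_flow = gcn_layer(A_hat, X_flow, rng.standard_normal((2, 8)))
# ... then fused per pixel before a segmentation head (fusion by
# concatenation is an assumption of this sketch).
H = np.concatenate([H_rgb, H_flow], axis=1)
print(H.shape)  # (16, 16): 16 pixel nodes, 16 fused features each
```

In the paper's full model the graph construction, the motion-orientation aggregator, and the occlusion-aware attention all go beyond this fixed-grid sketch; the sketch only captures the two-branch pixel-graph structure described in the abstract.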
