Conference: IEEE International Conference on Image Processing

Guidance And Teaching Network For Video Salient Object Detection

Abstract

Owing to the difficulty of mining spatial-temporal cues, existing approaches to video salient object detection (VSOD) struggle to understand complex and noisy scenarios and often fail to infer prominent objects. To alleviate these shortcomings, we propose a simple yet efficient architecture, termed Guidance and Teaching Network (GTNet), which independently distills effective spatial and temporal cues with implicit guidance and explicit teaching at the feature and decision levels, respectively. Specifically, we (a) introduce a temporal modulator that implicitly bridges features from the motion branch into the appearance branch and is capable of fusing cross-modal features collaboratively, and (b) utilise a motion-guided mask to propagate explicit cues during feature aggregation. This novel learning strategy achieves satisfactory results by decoupling the complex spatial-temporal cues and mapping informative cues across different modalities. Extensive experiments on three challenging benchmarks show that the proposed method runs at approximately 28 fps on a single TITAN Xp GPU and performs competitively against 14 cutting-edge baselines.
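The abstract names the two mechanisms but does not give their formulas. The sketch below illustrates the general idea only, under stated assumptions: implicit guidance is modelled as residual sigmoid gating of appearance features by motion features, and explicit teaching as elementwise multiplication by a motion saliency mask. The function names `modulate` and `apply_mask`, the gating form, and all numbers are hypothetical and are not taken from GTNet.

```python
import math


def sigmoid(x: float) -> float:
    """Logistic function, mapping any real value into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def modulate(appearance: list[float], motion: list[float]) -> list[float]:
    """Feature-level (implicit) guidance, ASSUMED to be residual gating.

    Each appearance activation is scaled by 1 + sigmoid(motion), so motion
    can amplify an appearance feature but never suppress it.
    """
    if len(appearance) != len(motion):
        raise ValueError("feature vectors must have the same length")
    return [a * (1.0 + sigmoid(m)) for a, m in zip(appearance, motion)]


def apply_mask(features: list[float], mask: list[float]) -> list[float]:
    """Decision-level (explicit) teaching, ASSUMED to be elementwise masking.

    The mask holds a per-position motion saliency weight in [0, 1].
    """
    if len(features) != len(mask):
        raise ValueError("features and mask must have the same length")
    return [f * w for f, w in zip(features, mask)]


# Toy flattened feature maps with four spatial positions (made-up values).
appearance = [0.5, 1.0, 2.0, 0.0]
motion = [0.0, 4.0, -4.0, 1.0]
motion_mask = [1.0, 1.0, 0.0, 0.5]

fused = modulate(appearance, motion)          # implicit, feature level
aggregated = apply_mask(fused, motion_mask)   # explicit, decision level
```

In this toy setup, a position with a zero mask weight contributes nothing after aggregation, while positions with strong motion responses have their appearance features amplified. A real model would operate on multi-channel tensors with learned parameters.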