Guidance And Teaching Network For Video Salient Object Detection

Abstract

Owing to the difficulty of mining spatial-temporal cues, existing approaches for video salient object detection (VSOD) are limited in understanding complex and noisy scenarios, and often fail to infer prominent objects. To alleviate these shortcomings, we propose a simple yet efficient architecture, termed Guidance and Teaching Network (GTNet), to independently distill effective spatial and temporal cues with implicit guidance and explicit teaching at the feature and decision level, respectively. Specifically, we (a) introduce a temporal modulator to implicitly bridge features from the motion branch into the appearance branch, which is capable of fusing cross-modal features collaboratively, and (b) utilise a motion-guided mask to propagate the explicit cues during feature aggregation. This novel learning strategy achieves satisfactory results by decoupling the complex spatial-temporal cues and mapping informative cues across different modalities. Extensive experiments on three challenging benchmarks show that the proposed method runs at ~28 fps on a single TITAN Xp GPU and performs competitively against 14 cutting-edge baselines.
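
The two mechanisms named in the abstract can be pictured with a toy sketch. The abstract does not give the actual operations, so everything below is an assumption made for illustration: the function names `modulate` and `apply_motion_mask` are hypothetical, the sigmoid gate stands in for the temporal modulator, and the element-wise mask stands in for the motion-guided mask. It is not the GTNet implementation.

```python
import math


def sigmoid(x):
    """Logistic function, used here as a simple gating nonlinearity."""
    return 1.0 / (1.0 + math.exp(-x))


def modulate(appearance, motion):
    """Feature-level (implicit) guidance, in an assumed form.

    Each appearance value is amplified by a gate computed from the
    motion value at the same position: a_out = a + a * sigmoid(m).
    """
    return [
        [a + a * sigmoid(m) for a, m in zip(a_row, m_row)]
        for a_row, m_row in zip(appearance, motion)
    ]


def apply_motion_mask(features, motion_mask):
    """Decision-level (explicit) teaching, in an assumed form.

    Features are weighted element-wise by a motion-derived mask in [0, 1],
    so regions without motion evidence are suppressed.
    """
    return [
        [f * w for f, w in zip(f_row, w_row)]
        for f_row, w_row in zip(features, motion_mask)
    ]


# Toy 2x2 "feature maps" (hypothetical values).
appearance = [[1.0, 2.0], [3.0, 4.0]]
motion = [[0.0, 0.0], [10.0, -10.0]]
mask = [[1.0, 0.0], [1.0, 0.0]]

fused = modulate(appearance, motion)     # motion gates appearance features
guided = apply_motion_mask(fused, mask)  # mask keeps only the left column
```

In this sketch a strongly positive motion response nearly doubles the appearance value, a strongly negative one leaves it almost unchanged, and the mask then zeroes positions the motion cue marks as background.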


Bibliographic details

Source
IEEE International Conference on Image Processing | 2021 | pp. 2199-2203 | 5 pages
Authors
Yingxia Jiao; Xiao Wang; Yu-Cheng Chou; Shouyuan Yang; Ge-Peng Ji; Rong Zhu; Ge Gao
Original format: PDF
Keywords
Bridges; Image processing; Conferences; Education; Modulation; Graphics processing units; Object detection


Similar literature


1. Video salient object detection via spatiotemporal attention neural networks [J]. Tang Yi, Zou Wenbin, Hua Yang. Neurocomputing, 2020, Feb. 15 issue.
2. Salient object detection in video using deep non-local neural networks [J]. Shokri Mohammad, Harati Ahad, Taba Kimya. Journal of Visual Communication & Image Representation, 2020, Apr. issue.
3. Discovering salient objects from videos using spatiotemporal salient region detection [J]. Kannan Rajkumar, Ghinea Gheorghita, Swaminathan Sridhar. Signal Processing: Image Communication: A Publication of the European Association for Signal Processing, 2015.
4. Dual-Stream Network Based On Global Guidance for Salient Object Detection [C]. Shuyong Gao, Qianyu Guo, Wei Zhang. IEEE International Conference on Acoustics, Speech and Signal Processing, 2021.
5. Saliency Cut: an Automatic Approach for Video Object Segmentation Based on Saliency Energy Minimization [D]. Wang, Yilin. 2013.
6. On the Distribution of Salient Objects in Web Images and Its Influence on Salient Object Detection [O]. Boris Schauerte, Rainer Stiefelhagen.
7. Video Salient Object Detection via Fully Convolutional Networks [O]. Wang Wenguan, Shen Jianbing, Shao Ling. 2018.
