
TASED-Net: Temporally-Aggregating Spatial Encoder-Decoder Network for Video Saliency Detection

Abstract

TASED-Net is a 3D fully-convolutional network architecture for video saliency detection. It consists of two building blocks: first, the encoder network extracts low-resolution spatiotemporal features from an input clip of several consecutive frames, and then the following prediction network decodes the encoded features spatially while aggregating all the temporal information. As a result, a single prediction map is produced from an input clip of multiple frames. Frame-wise saliency maps can be predicted by applying TASED-Net to a video in a sliding-window fashion. The proposed approach assumes that the saliency map of any frame can be predicted by considering a limited number of past frames. The results of our extensive experiments on video saliency detection validate this assumption and demonstrate that our fully-convolutional model with the temporal-aggregation method is effective. TASED-Net significantly outperforms previous state-of-the-art approaches on all three major large-scale datasets of video saliency detection: DHF1K, Hollywood2, and UCFSports. After analyzing the results qualitatively, we observe that our model is especially better at attending to salient moving objects.
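For concreteness, here is a minimal PyTorch sketch of the two building blocks the abstract describes, together with sliding-window inference. It is an illustration rather than the authors' implementation: the class name TinyTASED, all layer sizes, the clip length clip_len=8, and the use of simple mean pooling over the time axis as the temporal-aggregation step are assumptions made for this sketch (the paper's network is far deeper and learns its temporal aggregation).

```python
# Sketch of the TASED-Net idea from the abstract: a 3D-convolutional encoder
# compresses a clip of consecutive frames into low-resolution spatiotemporal
# features, and a prediction network collapses the temporal axis while
# upsampling spatially, producing one saliency map per clip.
import torch
import torch.nn as nn

class TinyTASED(nn.Module):  # hypothetical toy model, not the paper's network
    def __init__(self):
        super().__init__()
        # Encoder: strided 3D convolutions shrink spatial resolution while
        # mixing information across frames.
        self.encoder = nn.Sequential(
            nn.Conv3d(3, 32, kernel_size=3, stride=(1, 2, 2), padding=1),
            nn.ReLU(inplace=True),
            nn.Conv3d(32, 64, kernel_size=3, stride=(2, 2, 2), padding=1),
            nn.ReLU(inplace=True),
        )
        # Prediction network: after temporal aggregation, decode spatially
        # back to the input resolution with transposed 2D convolutions.
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(64, 32, kernel_size=4, stride=2, padding=1),
            nn.ReLU(inplace=True),
            nn.ConvTranspose2d(32, 1, kernel_size=4, stride=2, padding=1),
        )

    def forward(self, clip):                        # clip: (B, 3, T, H, W)
        feats = self.encoder(clip)                  # (B, 64, T', H/4, W/4)
        feats = feats.mean(dim=2)                   # temporal aggregation (mean is a stand-in)
        return torch.sigmoid(self.decoder(feats))   # single map: (B, 1, H, W)

def frame_wise_saliency(model, video, clip_len=8):
    """Sliding-window inference: each frame's saliency map is predicted from
    a clip of that frame and a limited number of past frames.
    video: (3, N, H, W) tensor of N RGB frames."""
    maps = []
    for t in range(video.shape[1]):
        clip = video[:, max(0, t - clip_len + 1):t + 1]
        # Repeat frame 0 at the start of the video so every clip has
        # clip_len frames (a simplifying assumption for this sketch).
        if clip.shape[1] < clip_len:
            pad = clip[:, :1].repeat(1, clip_len - clip.shape[1], 1, 1)
            clip = torch.cat([pad, clip], dim=1)
        maps.append(model(clip.unsqueeze(0)))
    return torch.cat(maps, dim=0)                   # (N, 1, H, W)

model = TinyTASED().eval()
video = torch.rand(3, 16, 64, 64)                   # 16 random 64x64 frames
with torch.no_grad():
    sal = frame_wise_saliency(model, video)
print(sal.shape)                                    # torch.Size([16, 1, 64, 64])
```

Running the snippet yields one full-resolution map per frame, each computed from only the current and preceding frames, which mirrors the sliding-window scheme and the limited-past-frames assumption stated in the abstract.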
