Home > Foreign Journals > IEEE Transactions on Image Processing > Spatiotemporal Knowledge Distillation for Efficient Estimation of Aerial Video Saliency

Spatiotemporal Knowledge Distillation for Efficient Estimation of Aerial Video Saliency



Abstract

The performance of video saliency estimation techniques has achieved significant advances along with the rapid development of Convolutional Neural Networks (CNNs). However, devices like cameras and drones may have limited computational capability and storage space so that the direct deployment of complex deep saliency models becomes infeasible. To address this problem, this paper proposes a dynamic saliency estimation approach for aerial videos via spatiotemporal knowledge distillation. In this approach, five components are involved, including two teachers, two students and the desired spatiotemporal model. The knowledge of spatial and temporal saliency is first separately transferred from the two complex and redundant teachers to their simple and compact students, while the input scenes are also degraded from high-resolution to low-resolution to remove the probable data redundancy so as to greatly speed up the feature extraction process. After that, the desired spatiotemporal model is further trained by distilling and encoding the spatial and temporal saliency knowledge of two students into a unified network. In this manner, the inter-model redundancy can be removed for the effective estimation of dynamic saliency on aerial videos. Experimental results show that the proposed approach is comparable to 11 state-of-the-art models in estimating visual saliency on aerial videos, while its speed reaches up to 28,738 FPS and 1,490.5 FPS on the GPU and CPU platforms, respectively.
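The core of the approach described above is transferring knowledge from complex teacher networks to compact students. A common way to do this is to minimize the divergence between temperature-softened teacher and student output distributions; the sketch below illustrates this idea for the two-teacher (spatial and temporal) setup. It is a minimal illustration, not the paper's implementation: the function names, the KL-divergence form of the loss, and the `alpha` weighting are assumptions for demonstration.

```python
import numpy as np

def softmax(x, temperature=1.0):
    """Convert raw saliency logits into a probability map (temperature-softened)."""
    z = (x - x.max()) / temperature
    e = np.exp(z)
    return e / e.sum()

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL divergence from the softened teacher map to the student map."""
    p_t = softmax(teacher_logits, temperature)
    p_s = softmax(student_logits, temperature)
    return float(np.sum(p_t * (np.log(p_t + 1e-12) - np.log(p_s + 1e-12))))

def spatiotemporal_loss(student_spatial, student_temporal,
                        teacher_spatial, teacher_temporal, alpha=0.5):
    """Hypothetical combined objective: weight the spatial and temporal
    distillation terms, mirroring the two-teacher/two-student setup."""
    return (alpha * distillation_loss(student_spatial, teacher_spatial)
            + (1 - alpha) * distillation_loss(student_temporal, teacher_temporal))
```

When the student reproduces the teacher's map exactly, the KL term is zero; any mismatch yields a positive loss, so gradient descent on this objective pulls the compact student toward the teacher's saliency predictions.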
