Journal: Neural Computing & Applications

Hierarchical attentive Siamese network for real-time visual tracking



Abstract

Visual tracking is a fundamental and highly useful component in various computer vision tasks. Recently, Siamese networks trained off-line in an end-to-end manner have demonstrated great success in visual tracking, with high performance in terms of both speed and accuracy. However, Siamese trackers usually employ visual features from the last simple convolutional layers to represent the targets, ignoring the fact that features from different layers have different representation capabilities, and this may degrade tracking performance in the presence of severe deformation and occlusion. In this paper, we present a novel hierarchical attentive Siamese (HASiam) network for high-performance visual tracking, which exploits different kinds of attention mechanisms to effectively fuse a series of attentional features from different layers. More specifically, we combine a deeper network with a shallow one to take full advantage of the features from different layers, and we apply spatial and channel-wise attention on different layers to better capture visual attention on multi-level semantic abstractions, which helps to enhance the discriminative capacity of the model. Furthermore, the top-layer feature maps have low resolution, which may affect localization accuracy if each feature is treated independently. To address this issue, a non-local attention module is also adopted on the top layer to force the network to pay more attention to the structural dependency of features at all locations during off-line training. The proposed HASiam is trained off-line in an end-to-end manner and requires no online updating of the network parameters during tracking. Extensive evaluations demonstrate that our HASiam has achieved favorable results with AUC scores of 64.6% and EAO scores of 0.227 while having a speed of 60 fps on the OTB2013, OTB100 and VOT2017 real-time experiments, respectively. Our tracker, with its high accuracy and real-time speed, can be applied to numerous vision applications such as visual surveillance systems, robotics and augmented reality.
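The abstract does not give HASiam's equations, so the following is only a minimal sketch of two generic ingredients it mentions: channel-wise attention on the template features, and the Siamese matching step that cross-correlates a template with a search region. The function names (`channel_attention`, `cross_correlate`, `paste`), the parameter-free softmax weighting and the toy feature maps are all assumptions made for illustration. The actual network presumably uses learned attention modules, and the spatial and non-local attention branches are omitted here.

```python
import math

def channel_attention(feat):
    """Reweight channels of feat (C lists of HxW lists).

    Assumption: weights are a softmax over global-average-pooled channel
    activations, a parameter-free stand-in for a learned gating module.
    """
    means = [sum(sum(row) for row in ch) / (len(ch) * len(ch[0])) for ch in feat]
    top = max(means)
    exps = [math.exp(m - top) for m in means]  # subtract max for stability
    total = sum(exps)
    weights = [e / total for e in exps]
    scaled = [[[w * v for v in row] for row in ch] for ch, w in zip(feat, weights)]
    return scaled, weights

def cross_correlate(template, search):
    """Slide the template over the search features (valid positions only)
    and sum the per-channel products into a single response map."""
    h, w = len(template[0]), len(template[0][0])
    H, W = len(search[0]), len(search[0][0])
    response = []
    for i in range(H - h + 1):
        row = []
        for j in range(W - w + 1):
            score = 0.0
            for t_ch, s_ch in zip(template, search):
                for di in range(h):
                    for dj in range(w):
                        score += t_ch[di][dj] * s_ch[i + di][j + dj]
            row.append(score)
        response.append(row)
    return response

def paste(patch, size, top, left):
    """Place a small patch on a size x size canvas of zeros (toy data helper)."""
    canvas = [[0.0] * size for _ in range(size)]
    for di, row in enumerate(patch):
        for dj, v in enumerate(row):
            canvas[top + di][left + dj] = v
    return canvas

# Toy two-channel template and a search region containing it at (row 2, col 3).
template = [[[1.0, 2.0], [3.0, 4.0]], [[0.5, 0.5], [0.5, 0.5]]]
search = [paste(ch, 5, 2, 3) for ch in template]

attended, weights = channel_attention(template)
response = cross_correlate(attended, search)

# The location of the largest response is the predicted target position.
_, peak_row, peak_col = max(
    (v, i, j) for i, row in enumerate(response) for j, v in enumerate(row)
)
peak = (peak_row, peak_col)
print(weights, peak)
```

In this toy setup the response peaks where the template was pasted, at row 2, column 3, and the channel with the larger mean activation receives the larger weight. A real tracker would compute the features with convolutional backbones and run the correlation as a batched convolution on the GPU.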
