IEEE Transactions on Pattern Analysis and Machine Intelligence

Robust Visual Tracking via Hierarchical Convolutional Features

Abstract

Visual tracking is challenging as target objects often undergo significant appearance changes caused by deformation, abrupt motion, background clutter and occlusion. In this paper, we propose to exploit the rich hierarchical features of deep convolutional neural networks to improve the accuracy and robustness of visual tracking. Deep neural networks trained on object recognition datasets consist of multiple convolutional layers. These layers encode target appearance with different levels of abstraction. For example, the outputs of the last convolutional layers encode the semantic information of targets, and such representations are invariant to significant appearance variations. However, their spatial resolutions are too coarse to precisely localize the target. In contrast, features from earlier convolutional layers provide more precise localization but are less invariant to appearance changes. We interpret the hierarchical features of convolutional layers as a nonlinear counterpart of an image pyramid representation and explicitly exploit these multiple levels of abstraction to represent target objects. Specifically, we learn adaptive correlation filters on the outputs from each convolutional layer to encode the target appearance. We infer the maximum response of each layer to locate targets in a coarse-to-fine manner. To further handle scale estimation and the re-detection of targets after tracking failures caused by heavy occlusion or out-of-view movement, we conservatively learn another correlation filter that maintains a long-term memory of target appearance as a discriminative classifier. We apply the classifier to two types of object proposals: (1) proposals with a small step size, tightly around the estimated location, for scale estimation; and (2) proposals with a large step size, across the whole image, for target re-detection. Extensive experimental results on large-scale benchmark datasets show that the proposed algorithm performs favorably against the state-of-the-art tracking methods.
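The central idea of the abstract is to learn one correlation filter per convolutional layer and to fuse the per-layer responses so that deep, semantic layers provide robustness while shallow layers refine localization. Below is a minimal, illustrative NumPy sketch of that idea: a linear ridge-regression correlation filter is learned in the Fourier domain on each layer's feature map, and the responses are combined with fixed weights rather than the paper's layer-by-layer coarse-to-fine search. The function names, Gaussian label width, regularization constant, and layer weights are assumptions for illustration, not the authors' implementation.

```python
import numpy as np

def gaussian_label(shape, sigma=2.0):
    """Gaussian regression target; peak moved to index (0, 0) as circular correlation expects."""
    h, w = shape
    ys, xs = np.mgrid[0:h, 0:w]
    cy, cx = h // 2, w // 2
    g = np.exp(-((ys - cy) ** 2 + (xs - cx) ** 2) / (2 * sigma ** 2))
    return np.fft.ifftshift(g)

def learn_filter(feat, label, lam=1e-4):
    """Ridge-regression correlation filter in the Fourier domain for one layer (H x W x C features)."""
    X = np.fft.fft2(feat, axes=(0, 1))
    Y = np.fft.fft2(label)[..., None]
    num = np.conj(X) * Y
    den = np.sum(np.conj(X) * X, axis=2, keepdims=True) + lam
    return num / den

def layer_response(A, feat):
    """Correlation response of a learned filter on a new feature map of the same layer."""
    Z = np.fft.fft2(feat, axes=(0, 1))
    return np.real(np.fft.ifft2(np.sum(A * Z, axis=2)))

def fuse_and_locate(responses, weights):
    """Weighted sum of max-normalized per-layer responses; argmax gives the estimated shift."""
    fused = sum(w * r / (r.max() + 1e-12) for w, r in zip(weights, responses))
    return np.unravel_index(np.argmax(fused), fused.shape)

# Toy usage with random stand-ins for conv-layer features (shallow -> deep), all resized to 32x32.
rng = np.random.default_rng(0)
feats = [rng.standard_normal((32, 32, c)) for c in (64, 128, 256)]
label = gaussian_label((32, 32))
filters = [learn_filter(f, label) for f in feats]
resps = [layer_response(A, f) for A, f in zip(filters, feats)]
print(fuse_and_locate(resps, weights=(0.25, 0.5, 1.0)))  # deeper layers weighted more
```

The fixed weights encode the trade-off described in the abstract: deeper layers are weighted more heavily because their semantic features are robust to appearance changes, while the shallower layers' responses sharpen the final position estimate.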
