
MSSTResNet-TLD: A robust tracking method based on tracking-learning-detection framework by using multi-scale spatio-temporal residual network feature model



Abstract

The performance of a tracking task depends directly on the appearance features of the target object, so a robust approach to constructing appearance features is crucial for adapting to appearance changes. To construct an accurate and robust appearance model for visual object tracking, we modify the original deep residual learning network architecture and name it the Multi-Scale Residual Network (MSResNet). The first frame of the current input video sequence and its related information are used to learn a multi-scale appearance model of the target object, and a loss function is minimized over the appearance features. Meanwhile, the spatial information of each video frame and the temporal information between successive frames are effectively combined with MSResNet. The resulting features, generated by the Multi-Scale Spatio-Temporal Residual Network and called the MSSTResNet feature model, can adapt to scale variation, illumination variation, background clutter, severe deformation of the target object, and so on. We implement a robust tracking method based on the tracking-learning-detection framework using our proposed MSSTResNet feature model and name it the MSSTResNet-TLD tracker. Unlike previous tracking methods, the MSResNet architecture is not pre-trained offline on large auxiliary datasets but is learned directly end-to-end with a multi-task loss on the current input video sequence. Furthermore, the multi-task loss combines a classification loss and a regression loss, which makes target localization more accurate. Our experimental results demonstrate that the proposed tracking method outperforms current state-of-the-art tracking methods on the Visual Object Tracking Benchmark (VOT-2016), Object Tracking Benchmark (OTB-2015), and Unmanned Aerial Vehicles (UAV20L) test datasets. Furthermore, our MSSTResNet-TLD tracker is faster than most previous trackers based on deep Convolutional Neural Networks (ConvNets or CNNs), and it is extremely robust to tiny target objects. (C) 2019 Elsevier B.V. All rights reserved.
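The abstract does not give implementation details, so the following is only a minimal sketch of the two ideas it does state: residual features gathered at multiple scales, and a multi-task loss that adds a classification term to a bounding-box regression term. It is written in PyTorch, and every module and variable name here (MultiScaleResidualFeatures, multi_task_loss, the 64x64 crop size, and so on) is a hypothetical illustration, not the authors' code.

# Minimal sketch (not the paper's implementation): generic multi-scale residual
# features plus a classification + box-regression multi-task loss.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ResidualBlock(nn.Module):
    """Standard residual block: two 3x3 convolutions with an identity shortcut."""
    def __init__(self, channels):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, 3, padding=1)
        self.conv2 = nn.Conv2d(channels, channels, 3, padding=1)

    def forward(self, x):
        return F.relu(x + self.conv2(F.relu(self.conv1(x))))

class MultiScaleResidualFeatures(nn.Module):
    """Collects residual features at several spatial scales and concatenates
    them into one descriptor; a stand-in for the MSResNet idea in the abstract."""
    def __init__(self, in_channels=3, channels=32, num_scales=3):
        super().__init__()
        self.stem = nn.Conv2d(in_channels, channels, 3, padding=1)
        self.blocks = nn.ModuleList(ResidualBlock(channels) for _ in range(num_scales))

    def forward(self, x):
        x = F.relu(self.stem(x))
        feats = []
        for block in self.blocks:
            x = block(x)
            feats.append(F.adaptive_avg_pool2d(x, 1).flatten(1))  # one vector per scale
            x = F.max_pool2d(x, 2)                                # move to a coarser scale
        return torch.cat(feats, dim=1)

def multi_task_loss(cls_logits, cls_targets, box_preds, box_targets, reg_weight=1.0):
    """Classification (target vs. background) plus bounding-box regression,
    mirroring the classification + regression terms mentioned in the abstract."""
    cls_loss = F.cross_entropy(cls_logits, cls_targets)
    reg_loss = F.smooth_l1_loss(box_preds, box_targets)
    return cls_loss + reg_weight * reg_loss

# Usage sketch: learn from sampled crops of the first frame only, as the abstract
# describes, rather than from a large offline auxiliary dataset.
if __name__ == "__main__":
    backbone = MultiScaleResidualFeatures()
    feat_dim = 32 * 3                       # channels * num_scales
    cls_head = nn.Linear(feat_dim, 2)       # target / background
    box_head = nn.Linear(feat_dim, 4)       # (dx, dy, dw, dh) offsets

    crops = torch.randn(8, 3, 64, 64)       # patches sampled around the annotated target
    labels = torch.randint(0, 2, (8,))
    boxes = torch.randn(8, 4)

    feats = backbone(crops)
    loss = multi_task_loss(cls_head(feats), labels, box_head(feats), boxes)
    loss.backward()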

Bibliographic details

  • Source
    Neurocomputing | 2019, Issue 14 | pp. 175-194 | 20 pages
  • Author affiliations

    Chongqing Univ Sci & Technol Coll Math & Big Data Chongqing Peoples R China;

    Harbin Inst Technol Shenzhen Grad Sch Sch Comp Sci & Technol Harbin Heilongjiang Peoples R China;

    Chongqing Univ Coll Comp Sci Chongqing Peoples R China;

    China Mobile IOT Co Ltd Open Platform Dept Shenzhen Guangdong Peoples R China;

  • Indexed in: Science Citation Index (SCI); Engineering Index (EI)
  • Original format: PDF
  • Language: eng
  • Chinese Library Classification (CLC)
  • Keywords

    Tracking-learning-detection; Spatio-temporal feature; Multi-scale feature; Residual network;


