Learning multiple instance deep quality representation for robust object tracking

Abstract

Robustly tracking objects in video streams with complex objects and backgrounds is a useful technique for next-generation computer vision systems. In practice, however, designing a successful video-based object tracking system is difficult because of varying lighting conditions, possible occlusions, and fast-moving objects. In this work, a novel weakly-supervised, quality-guided visual object tracking model is proposed. Its key component is a bidirectional long short-term memory recurrent neural network (BLSTM-RNN) that captures the feature sequence and predicts the quality score of each candidate window. More specifically, given a rich set of training videos annotated with the target objects, a weakly-supervised learning algorithm is first used to project all candidate window features onto the semantic space. Next, a two-stage algorithm selects the key frames from the video sequences, applying both shallow and deep filtering operations. Subsequently, the BLSTM-RNN characterizes the feature sequence temporally, and on this basis the most probable object window is calculated and output. For the experiments, a large video dataset containing 2045 NBA regular-season and playoff basketball games was compiled. On this dataset, the proposed algorithm is compared with state-of-the-art video tracking methods. Extensive visualization results and comparative tracking precision show that the proposed method is competitive.
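The quality-scoring step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the weights are random and untrained, bias terms are omitted, and the names `LSTMCell`, `blstm_quality_scores`, and `best_window` are hypothetical. It assumes one feature vector per candidate window, arranged as a sequence, and shows only how a forward and a backward LSTM pass can be combined into a per-window quality score from which the highest-scoring window is chosen.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(W, v):
    # Multiply a matrix (list of rows) by a vector (list of floats).
    return [sum(w * x for w, x in zip(row, v)) for row in W]


class LSTMCell:
    """Plain LSTM cell with random, untrained weights and no biases (assumption)."""

    def __init__(self, input_size, hidden_size, rng):
        n = input_size + hidden_size

        def init():
            return [[rng.uniform(-0.1, 0.1) for _ in range(n)]
                    for _ in range(hidden_size)]

        self.hidden_size = hidden_size
        self.W_i, self.W_f, self.W_o, self.W_g = init(), init(), init(), init()

    def step(self, x, h, c):
        z = x + h  # concatenate the input with the previous hidden state
        i = [sigmoid(a) for a in matvec(self.W_i, z)]    # input gate
        f = [sigmoid(a) for a in matvec(self.W_f, z)]    # forget gate
        o = [sigmoid(a) for a in matvec(self.W_o, z)]    # output gate
        g = [math.tanh(a) for a in matvec(self.W_g, z)]  # candidate cell state
        c_new = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c, i, g)]
        h_new = [ok * math.tanh(ck) for ok, ck in zip(o, c_new)]
        return h_new, c_new


def run(cell, seq):
    # Run the cell over a sequence and return the hidden state at every step.
    h = [0.0] * cell.hidden_size
    c = [0.0] * cell.hidden_size
    states = []
    for x in seq:
        h, c = cell.step(x, h, c)
        states.append(h)
    return states


def blstm_quality_scores(features, fwd, bwd, w_out):
    # Forward pass reads the sequence left to right; the backward pass reads
    # it reversed, and its outputs are flipped back to align positions.
    h_fwd = run(fwd, features)
    h_bwd = run(bwd, features[::-1])[::-1]
    scores = []
    for a, b in zip(h_fwd, h_bwd):
        logit = sum(w * v for w, v in zip(w_out, a + b))
        scores.append(sigmoid(logit))  # quality score in (0, 1)
    return scores


def best_window(features, fwd, bwd, w_out):
    # Return the index of the highest-scoring window and all scores.
    scores = blstm_quality_scores(features, fwd, bwd, w_out)
    best = max(range(len(scores)), key=scores.__getitem__)
    return best, scores
```

A hypothetical call with five 3-dimensional feature vectors might look like this:

```python
rng = random.Random(0)
fwd, bwd = LSTMCell(3, 4, rng), LSTMCell(3, 4, rng)
w_out = [rng.uniform(-1.0, 1.0) for _ in range(8)]  # 4 forward + 4 backward units
features = [[rng.random() for _ in range(3)] for _ in range(5)]
index, scores = best_window(features, fwd, bwd, w_out)
```

In a trained model the weights would be learned from the annotated videos; here they only demonstrate the data flow.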

Bibliographic details

  • Source
    Future Generation Computer Systems | 2020, Issue 12 | pp. 298-303 | 6 pages
  • Author affiliations

    School of Business Administration, Guangxi University of Finance and Economics, No. 189 Daxuexi Road, Xixiangtang District, Nanning, Guangxi 530007, China;

    School of Business Administration, Guangxi University of Finance and Economics, No. 189 Daxuexi Road, Xixiangtang District, Nanning, Guangxi 530007, China;

    School of Business Administration, Guangxi University of Finance and Economics, No. 189 Daxuexi Road, Xixiangtang District, Nanning, Guangxi 530007, China;

    School of Economics and Management, Dongguan University of Technology, No. 1 Daxue Road, Songshan Lake, Dongguan, Guangdong 523808, China;

  • Indexed in: Science Citation Index (SCI, USA); Engineering Index (EI, USA)
  • Original format: PDF
  • Language: English
  • Keywords

    Visual object tracking; Quality model; Bidirectional LSTM; Weakly-supervised; Spatial temporal modeling;
