Journal: IEEE Transactions on Pattern Analysis and Machine Intelligence

AD-VAT+: An Asymmetric Dueling Mechanism for Learning and Understanding Visual Active Tracking

Abstract

Visual Active Tracking (VAT) aims at following a target object by autonomously controlling the motion system of a tracker given visual observations. To learn a robust tracker for VAT, in this article we propose a novel adversarial reinforcement learning (RL) method that adopts an Asymmetric Dueling mechanism, referred to as AD-VAT. In this mechanism, the tracker and the target, viewed as two learnable agents, are opponents that can mutually enhance each other during the dueling/competition: the tracker intends to lock onto the target, while the target tries to escape from the tracker. The dueling is asymmetric in that the target is additionally fed the tracker's observation and action, and learns to predict the tracker's reward as an auxiliary task. Such an asymmetric dueling mechanism produces a stronger target, which in turn induces a more robust tracker. To improve the tracker's performance in challenging scenarios, such as those with obstacles, we employ a more advanced environment augmentation technique and two-stage training strategies, together termed AD-VAT+. For a better understanding of the asymmetric dueling mechanism, we also analyze the target's behaviors as training proceeds and visualize the latent space of the tracker. The experimental results, in both 2D and 3D environments, demonstrate that the proposed method leads to faster convergence in training and yields more robust tracking behaviors in different testing scenarios. The potential of the active tracker is also shown in real-world videos.
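
The asymmetry described in the abstract can be illustrated with a small sketch. The abstract does not give the reward formula or the network design, so everything concrete below is an assumption made for illustration: the distance/angle reward shaping in `tracker_reward`, the purely zero-sum `target_reward`, the flat feature vectors, and all function and field names are hypothetical and are not taken from the paper. What the sketch does reflect from the abstract is that the target receives the tracker's observation and action in addition to its own observation, and that it is trained with an auxiliary loss to predict the tracker's reward.

```python
import math
from dataclasses import dataclass
from typing import List


@dataclass
class Step:
    """One time step of the duel (all fields are hypothetical placeholders)."""
    tracker_obs: List[float]   # tracker's observation features
    tracker_action: int        # index of the tracker's discrete action
    target_obs: List[float]    # target's own observation features
    tracker_reward: float      # reward the tracker received at this step


def tracker_reward(distance: float, angle: float,
                   desired_distance: float = 2.0,
                   max_distance: float = 4.0,
                   max_angle: float = math.pi / 2) -> float:
    """Assumed shaping: 1.0 when the target is dead ahead at the desired
    distance, decreasing with distance and angle error, floored at -1.0."""
    error = abs(distance - desired_distance) / max_distance + abs(angle) / max_angle
    return max(-1.0, 1.0 - error)


def target_reward(r_tracker: float) -> float:
    """Assumed purely zero-sum duel: the target gains what the tracker loses."""
    return -r_tracker


def target_input(step: Step, num_actions: int) -> List[float]:
    """Asymmetric input: the target's own observation, plus the tracker's
    observation and a one-hot encoding of the tracker's action."""
    one_hot = [1.0 if i == step.tracker_action else 0.0 for i in range(num_actions)]
    return step.target_obs + step.tracker_obs + one_hot


def auxiliary_loss(predicted: List[float], steps: List[Step]) -> float:
    """Mean squared error between the target's predictions of the tracker's
    reward and the rewards the tracker actually received."""
    errors = [(p - s.tracker_reward) ** 2 for p, s in zip(predicted, steps)]
    return sum(errors) / len(errors)
```

In a full training loop, the target's policy loss would be combined with `auxiliary_loss` under some weighting, and the tracker would be trained only on its own observation; those details are left out here because the abstract does not specify them.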