IEEE/RSJ International Conference on Intelligent Robots and Systems

Adversarially Robust Policy Learning: Active construction of physically-plausible perturbations

Abstract

Policy search methods in reinforcement learning have demonstrated success in scaling up to larger problems beyond toy examples. However, deploying these methods on real robots remains challenging due to the large sample complexity required during learning and their vulnerability to malicious intervention. We introduce Adversarially Robust Policy Learning (ARPL), an algorithm that leverages active computation of physically-plausible adversarial examples during training to enable robust policy learning in the source domain and robust performance under both random and adversarial input perturbations. We evaluate ARPL on four continuous control tasks and show superior resilience to changes in physical environment dynamics parameters and environment state as compared to state-of-the-art robust policy learning methods. Code, data, and additional experimental results are available at: stanfordvl.github.io/ARPL.
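The core mechanism the abstract describes, actively generating small, physically-plausible adversarial perturbations of the agent's state during training, can be illustrated with a fast-gradient-sign (FGSM-style) nudge on the observation. The sketch below is a minimal illustration under assumed choices, not the paper's implementation: `PolicyNet`, the squared-action-norm surrogate loss, `epsilon`, and the perturbation probability are all hypothetical stand-ins.

```python
# Minimal sketch: FGSM-style state perturbation during policy training.
# Assumptions (not from the paper): PyTorch, a deterministic MLP policy,
# a squared-action-norm surrogate loss, epsilon-bounded perturbations.
import torch
import torch.nn as nn

class PolicyNet(nn.Module):
    """Hypothetical MLP policy mapping states to action means."""
    def __init__(self, state_dim: int, action_dim: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim, 64), nn.Tanh(),
            nn.Linear(64, action_dim),
        )

    def forward(self, state: torch.Tensor) -> torch.Tensor:
        return self.net(state)

def adversarial_state(policy: nn.Module, state: torch.Tensor,
                      epsilon: float = 0.01) -> torch.Tensor:
    """Return `state` nudged by epsilon along the gradient sign of a
    surrogate loss, i.e. the direction that most changes the policy."""
    state = state.clone().detach().requires_grad_(True)
    # Squared action norm as a stand-in objective; the paper's actual
    # loss choice may differ -- this only illustrates the gradient-sign step.
    loss = policy(state).pow(2).sum()
    loss.backward()
    return (state + epsilon * state.grad.sign()).detach()

# Usage: with some probability, feed the policy a perturbed observation
# instead of the true one during training rollouts.
policy = PolicyNet(state_dim=4, action_dim=1)
obs = torch.randn(4)
if torch.rand(()).item() < 0.5:   # perturbation frequency, an assumed knob
    obs = adversarial_state(policy, obs)
action = policy(obs)
```

Keeping epsilon small is what makes such a perturbation "physically plausible" in spirit: the perturbed state stays within a small neighborhood of states the system could actually visit, rather than being an arbitrary adversarial input.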
