IEEE/RSJ International Conference on Intelligent Robots and Systems

Adversarially Robust Policy Learning: Active construction of physically-plausible perturbations


Abstract

Policy search methods in reinforcement learning have demonstrated success in scaling up to larger problems beyond toy examples. However, deploying these methods on real robots remains challenging due to the large sample complexity required during learning and their vulnerability to malicious intervention. We introduce Adversarially Robust Policy Learning (ARPL), an algorithm that leverages active computation of physically-plausible adversarial examples during training to enable robust policy learning in the source domain and robust performance under both random and adversarial input perturbations. We evaluate ARPL on four continuous control tasks and show superior resilience to changes in physical environment dynamics parameters and environment state as compared to state-of-the-art robust policy learning methods. Code, data, and additional experimental results are available at: stanfordvl.github.io/ARPL.
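The abstract describes the mechanism only at a high level: adversarial examples are computed actively during training and fed back into policy learning. As a rough illustration of that loop, here is a minimal PyTorch sketch of a gradient-based (FGSM-style) observation perturbation. The function name, the `epsilon` bound, and the action-norm proxy loss are assumptions made for illustration, not details taken from the paper; the authors' actual code is linked above.

```python
# A minimal sketch of the core ARPL idea: perturb the agent's state
# observation with a gradient-based (FGSM-style) adversarial example during
# training rollouts. This is NOT the authors' implementation; `policy`,
# `epsilon`, and the action-norm proxy loss are all illustrative assumptions.
import torch
import torch.nn as nn


def adversarial_state(policy: nn.Module, state: torch.Tensor,
                      epsilon: float = 0.01) -> torch.Tensor:
    """Return a norm-bounded perturbation of `state` along the direction
    the policy's output is most sensitive to (fast gradient sign method)."""
    state = state.clone().detach().requires_grad_(True)
    action = policy(state)
    # Proxy objective: the squared norm of the action. Its gradient w.r.t.
    # the state points in directions that most change the policy's output.
    loss = action.pow(2).sum()
    loss.backward()
    with torch.no_grad():
        perturbed = state + epsilon * state.grad.sign()
    return perturbed.detach()


# Hypothetical usage inside a rollout: replace the true observation with
# its adversarial counterpart for a fraction of steps, so the policy is
# trained against worst-case, norm-bounded input perturbations.
# state = adversarial_state(policy, state) if torch.rand(()) < 0.5 else state
```

Bounding the perturbation by a small `epsilon` is what keeps such adversarial states "physically plausible" in spirit: the attack stays within a small, realistic neighborhood of the true state rather than producing arbitrary inputs.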
