International Symposium on Robotics Research

AdaPT: Zero-Shot Adaptive Policy Transfer for Stochastic Dynamical Systems



Abstract

Deep reinforcement learning (RL) has achieved remarkable advances in sequential decision making in recent years, often outperforming humans on tasks such as Atari games. However, model-free variants of deep RL are not directly applicable to physical systems because of their poor sample complexity: they often require millions of training examples gathered from an accurate model of the environment. One approach to using model-free RL methods on robotic systems is thus to train in a relatively accurate simulator (a source domain) and transfer the policy to the physical robot (a target domain). In practice, this naive transfer may perform arbitrarily badly, so online fine-tuning is often performed. During this fine-tuning, however, the robot may behave unsafely. It is therefore desirable for a system trained in a simulator with slight model inaccuracies to still perform well on the target system on the first iteration. We refer to this as the zero-shot policy transfer problem.
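The source/target gap described in the abstract can be illustrated with a toy experiment: tune a policy on nominal (source) dynamics, then deploy it zero-shot on perturbed (target) dynamics and compare costs. This is a minimal sketch of the problem setting only, not the paper's AdaPT method; the scalar linear system, parameters, and all names below are illustrative assumptions.

```python
import numpy as np

def rollout(policy_gain, a, b=1.0, noise_std=0.05, horizon=50, seed=0):
    """Run a linear-feedback policy u = -k*x on the stochastic system
    x' = a*x + b*u + w, and return the accumulated quadratic cost."""
    rng = np.random.default_rng(seed)
    x, cost = 1.0, 0.0
    for _ in range(horizon):
        u = -policy_gain * x
        cost += x**2 + 0.1 * u**2
        x = a * x + b * u + rng.normal(0.0, noise_std)
    return cost

# "Train" in the source domain (simulator, a=1.0): grid-search the
# feedback gain that minimizes average cost over stochastic rollouts.
gains = np.linspace(0.0, 1.5, 31)
src_costs = [np.mean([rollout(k, a=1.0, seed=s) for s in range(10)])
             for k in gains]
k_star = gains[int(np.argmin(src_costs))]

# Zero-shot evaluation: deploy the same policy, with no fine-tuning,
# in a target domain whose dynamics are slightly misspecified (a=1.2).
src = np.mean([rollout(k_star, a=1.0, seed=s) for s in range(10)])
tgt = np.mean([rollout(k_star, a=1.2, seed=s) for s in range(10)])
print(f"gain={k_star:.2f}  source cost={src:.2f}  target cost={tgt:.2f}")
```

Even this small model mismatch degrades the transferred policy's cost on the target system, which is the gap that zero-shot adaptive transfer methods aim to close.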

