...
JMLR: Workshop and Conference Proceedings

Accelerating Imitation Learning with Predictive Models

Abstract

Sample efficiency is critical in solving real-world reinforcement learning problems where agent-environment interactions can be costly. Imitation learning from expert advice has proved to be an effective strategy for reducing the number of interactions required to train a policy. Online imitation learning, which interleaves policy evaluation and policy optimization, is a particularly effective technique with provable performance guarantees. In this work, we seek to further accelerate the convergence rate of online imitation learning, thereby making it more sample efficient. We propose two model-based algorithms inspired by Follow-the-Leader (FTL) with prediction: MoBIL-VI based on solving variational inequalities and MoBIL-Prox based on stochastic first-order updates. These two methods leverage a model to predict future gradients to speed up policy learning. When the model oracle is learned online, these algorithms can provably accelerate the best known convergence rate up to an order. Our algorithms can be viewed as a generalization of stochastic Mirror-Prox (Juditsky et al., 2011), and admit a simple constructive FTL-style analysis of performance.
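The abstract names the predictive-gradient idea only at a high level. As a rough illustration of the optimistic update pattern behind "FTL with prediction" and stochastic Mirror-Prox (Juditsky et al., 2011), the following minimal Python sketch takes a model-based look-ahead step, queries the true gradient there, and then updates from the original iterate. The quadratic objective, the `true_gradient` and `predicted_gradient` stand-ins, and the fixed step size `eta` are illustrative assumptions, not the paper's MoBIL-VI or MoBIL-Prox algorithms.

```python
import numpy as np

# Sketch of an optimistic/extragradient update: use a cheap model
# prediction of the next gradient to take a look-ahead half-step,
# spend one "real interaction" evaluating the true gradient at that
# look-ahead point, then update the original iterate with it.
# (Euclidean mirror map; toy quadratic loss as an assumption.)

rng = np.random.default_rng(0)
dim, eta = 5, 0.3
target = np.ones(dim)            # hypothetical optimum of the toy loss

def true_gradient(theta):
    # Noisy gradient of f(theta) = 0.5 * ||theta - target||^2,
    # standing in for a costly policy-evaluation rollout.
    return (theta - target) + 0.1 * rng.standard_normal(dim)

def predicted_gradient(theta):
    # Stand-in for the learned predictive model's gradient estimate;
    # in the paper this oracle is itself learned online.
    return theta - target

theta = np.zeros(dim)
for t in range(100):
    look_ahead = theta - eta * predicted_gradient(theta)  # model-based half-step
    g = true_gradient(look_ahead)                         # one real interaction
    theta = theta - eta * g                               # prox-style update

print("distance to optimum:", np.linalg.norm(theta - target))
```

When the predicted gradient is close to the true one, the look-ahead point already sits near where the next iterate will land, so each real interaction is used more effectively; this is the intuition behind the accelerated convergence rate claimed when the model oracle is learned online.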