Journal: IEEE Transactions on Robotics

Real-World Reinforcement Learning via Multifidelity Simulators

Abstract

Reinforcement learning (RL) can be a tool for designing policies and controllers for robotic systems. However, the cost of real-world samples remains prohibitive as many RL algorithms require a large number of samples before learning useful policies. Simulators are one way to decrease the number of required real-world samples, but imperfect models make deciding when and how to trust samples from a simulator difficult. We present a framework for efficient RL in a scenario where multiple simulators of a target task are available, each with varying levels of fidelity. The framework is designed to limit the number of samples used in each successively higher-fidelity/cost simulator by allowing a learning agent to choose to run trajectories at the lowest level simulator that will still provide it with useful information. Theoretical proofs of the framework's sample complexity are given and empirical results are demonstrated on a remote-controlled car with multiple simulators. The approach enables RL algorithms to find near-optimal policies in a physical robot domain with fewer expensive real-world samples than previous transfer approaches or learning without simulators.
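The core idea of the abstract, escalating to a more expensive simulator only when the cheaper one has no more useful information to offer, can be sketched with a toy example. The bandit-style abstraction below, the per-level bias/cost numbers, and the elimination rule are illustrative inventions, not the paper's actual MFRL algorithm: each "simulator" is a biased, noisy model of the same two-action task, and candidate actions are pruned at a cheap level before any samples are spent on them at the expensive "real robot" level.

```python
import random

# Hypothetical stand-in for a multifidelity simulator chain: model bias
# shrinks, and per-sample cost grows, as fidelity increases. The last
# level plays the role of the real robot.
SIM_BIAS = [0.30, 0.10, 0.0]              # worst-case model error per level
SIM_COST = [1, 10, 100]                   # relative cost of one sample
TRUE_REWARD = {"slow": 0.5, "fast": 0.8}  # true value of each action

def sample(level, action, rng):
    """Draw one noisy, biased reward from the simulator at `level`."""
    bias = SIM_BIAS[level] if action == "fast" else 0.0
    return TRUE_REWARD[action] - bias + rng.gauss(0, 0.05)

def learn_multifidelity(n_per_level=300, seed=0):
    rng = random.Random(seed)
    candidates = set(TRUE_REWARD)  # actions still worth testing
    estimates = {}
    total_cost = 0
    for level in range(len(SIM_BIAS)):
        # Re-estimate each surviving action's value at this fidelity.
        for action in sorted(candidates):
            rewards = [sample(level, action, rng) for _ in range(n_per_level)]
            estimates[action] = sum(rewards) / len(rewards)
            total_cost += n_per_level * SIM_COST[level]
        # Prune actions that cannot beat the current leader even after
        # granting them this level's worst-case bias plus a noise margin,
        # so the next (costlier) simulator sees fewer candidates.
        leader = max(candidates, key=estimates.get)
        slack = SIM_BIAS[level] + 0.05
        candidates = {a for a in candidates
                      if estimates[a] + slack >= estimates[leader]}
    best = max(candidates, key=estimates.get)
    return best, total_cost
```

With these numbers, the level-0 simulator cannot separate the two actions (its bias swamps the true gap), level 1 eliminates "slow", and only "fast" ever consumes samples at the expensive top level, which is the cost-saving pattern the framework formalizes.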

