
Efficient Model-Based Deep Reinforcement Learning with Variational State Tabulation

JMLR: Workshop and Conference Proceedings


Abstract

Modern reinforcement learning algorithms reach super-human performance on many board and video games, but they are sample-inefficient, i.e. they typically require significantly more playing experience than humans to reach an equal performance level. To improve sample efficiency, an agent may build a model of the environment and use planning methods to update its policy. In this article we introduce Variational State Tabulation (VaST), which maps an environment with a high-dimensional state space (e.g. the space of visual inputs) to an abstract tabular model. Prioritized sweeping with small backups, a highly efficient planning method, can then be used to update state-action values. We show how VaST can rapidly learn to maximize reward in tasks like 3D navigation and efficiently adapt to sudden changes in rewards or transition probabilities.
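The planning step the abstract refers to can be illustrated with classic prioritized sweeping over a learned tabular model. The sketch below is a simplified illustration under assumed settings: deterministic transitions, full backups rather than the small backups used in the paper, and integer indices for the abstract states and actions. Every name in it (`observe`, `sweep`, `GAMMA`, `THETA`) is hypothetical and this is not VaST's actual implementation.

```python
import heapq
from collections import defaultdict

GAMMA = 0.99   # discount factor (assumed value)
THETA = 1e-4   # minimum priority worth queueing (assumed value)
N_SWEEPS = 50  # backups performed per planning call (assumed value)

Q = defaultdict(float)           # Q[(s, a)] -> action-value estimate
model = {}                       # model[(s, a)] -> (reward, next_state)
predecessors = defaultdict(set)  # next_state -> {(s, a) pairs leading to it}
pqueue = []                      # max-heap via negated priorities

def greedy_value(s, actions):
    # Value of the greedy action in abstract (tabular) state s.
    return max(Q[(s, a)] for a in actions)

def observe(s, a, r, s_next, actions):
    # Record one experienced transition and queue it if its
    # temporal-difference error is large enough to matter.
    model[(s, a)] = (r, s_next)
    predecessors[s_next].add((s, a))
    priority = abs(r + GAMMA * greedy_value(s_next, actions) - Q[(s, a)])
    if priority > THETA:
        heapq.heappush(pqueue, (-priority, (s, a)))

def sweep(actions):
    # Propagate value changes backwards through the learned model,
    # highest-priority state-action pair first.
    for _ in range(N_SWEEPS):
        if not pqueue:
            break
        _, (s, a) = heapq.heappop(pqueue)
        r, s_next = model[(s, a)]
        Q[(s, a)] = r + GAMMA * greedy_value(s_next, actions)
        # The value of s may have changed: re-prioritize its predecessors.
        for (sp, ap) in predecessors[s]:
            rp, _ = model[(sp, ap)]
            priority = abs(rp + GAMMA * greedy_value(s, actions) - Q[(sp, ap)])
            if priority > THETA:
                heapq.heappush(pqueue, (-priority, (sp, ap)))
```

Because the model is tabular over the abstract states, each backup is a cheap dictionary lookup, which is what lets planning keep pace with experience; the small backups used in the paper reduce the per-update cost further by backing up through one successor sample at a time rather than a full expectation.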

