Conference: European Workshop on Reinforcement Learning

Exploiting Additive Structure in Factored MDPs for Reinforcement Learning


Abstract

SDYNA is a framework for large, discrete, stochastic reinforcement learning problems. It incrementally learns a factored Markov decision process (FMDP) representing the problem to solve, while using FMDP planning techniques to build an efficient policy. SPITI, an instantiation of SDYNA, uses a planning method based on dynamic programming, which cannot exploit the additive structure of an FMDP. In this paper, we present two new instantiations of SDYNA, namely ULP and UNATLP, which use a planning method based on linear programming that can exploit the additive structure of an FMDP and address problems out of reach of SPITI.
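
To make the notion of "additive structure" concrete, the sketch below represents a reward function as a sum of local terms, each depending on only a few state variables. It is an illustration only and is not taken from the paper: the variable names, scopes, and reward values are hypothetical, and it does not implement ULP, UNATLP, or SPITI. It only shows why an additive representation can be far smaller than a flat table over all states.

```python
from itertools import product

# Hypothetical factored state: four binary variables (names are assumptions).
VARIABLES = ["x0", "x1", "x2", "x3"]

# Additive reward: a list of (scope, table) pairs. Each table maps the
# values of the variables in its scope to a local reward contribution.
LOCAL_REWARDS = [
    (("x0",), {(0,): 0.0, (1,): 1.0}),
    (("x1", "x2"), {(0, 0): 0.0, (0, 1): 0.5, (1, 0): 0.5, (1, 1): 2.0}),
    (("x3",), {(0,): 0.0, (1,): 3.0}),
]


def additive_reward(state):
    """Total reward of a state (dict: variable -> value) as a sum of local terms."""
    return sum(
        table[tuple(state[var] for var in scope)]
        for scope, table in LOCAL_REWARDS
    )


def flat_reward_table():
    """Enumerate every joint state; size grows exponentially with the variables."""
    table = {}
    for values in product((0, 1), repeat=len(VARIABLES)):
        state = dict(zip(VARIABLES, values))
        table[values] = additive_reward(state)
    return table


def factored_size():
    """Number of entries stored by the additive representation."""
    return sum(len(table) for _, table in LOCAL_REWARDS)
```

With these hypothetical numbers, the flat table needs one entry per joint state (2^4 = 16), while the additive form stores only 2 + 4 + 2 = 8 local entries. The gap widens quickly as variables are added, which is the kind of structure a planner must be able to exploit to scale.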
