SDYNA is a framework for addressing large, discrete, stochastic reinforcement learning problems. It incrementally learns an FMDP representing the problem to solve while using FMDP planning techniques to build an efficient policy. SPITI, an instantiation of SDYNA, uses a planning method based on dynamic programming that cannot exploit the additive structure of an FMDP. In this paper, we present two new instantiations of SDYNA, namely ULP and UNATLP, which use a planning method based on linear programming that can exploit the additive structure of an FMDP and thus address problems beyond the reach of SPITI.