Bayesian reinforcement learning in Markovian and non-Markovian tasks

Abstract

We present a Bayesian reinforcement learning model with a working memory module which can solve some non-Markovian decision processes. The model is tested, and compared against SARSA(λ), on a standard working-memory task from the psychology literature. Our method uses the Kalman temporal difference framework, and its extension to stochastic state transitions, to give posterior distributions over state-action values. This framework provides a natural mechanism for using reward information to update more than the current state-action pair, and thus negates the use of eligibility traces. Furthermore, the existence of full posterior distributions allows the use of Thompson sampling for action selection, which in turn removes the need to choose an appropriately parameterised action-selection method.
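
As a rough illustration of the Thompson sampling step mentioned in the abstract, the sketch below assumes the posterior over the state-action values in the current state is Gaussian with a known mean vector and covariance matrix, such as a Kalman-style filter might supply. The function name thompson_select, the three-action setup, and all numbers are hypothetical and are not taken from the paper.

```python
import numpy as np

def thompson_select(mean, cov, rng):
    """Sample one Q-value vector from N(mean, cov) and act greedily on it.

    mean: posterior mean of the action values in the current state (assumed given)
    cov:  posterior covariance of those values (assumed given)
    """
    sampled_q = rng.multivariate_normal(mean, cov)
    return int(np.argmax(sampled_q))

# Hypothetical posterior over three actions (illustrative numbers only).
rng = np.random.default_rng(0)
mean = np.array([0.2, 0.5, 0.1])
cov = np.diag([0.05, 0.05, 0.05])

# Count how often each action is chosen over repeated draws.
choices = [thompson_select(mean, cov, rng) for _ in range(2000)]
counts = np.bincount(choices, minlength=3)
```

Under these assumptions the action with the highest posterior mean is chosen most often, while the others are still tried in proportion to how plausible it is that they are best, so no separate exploration parameter has to be tuned.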

Bibliographic record

Authors: Ez-Zizi Adnane; Farrell Simon; Leslie David
Year: 2016
Format: PDF
Language: English

Similar literature

1. Totally model-free actor-critic recurrent neural-network reinforcement learning in non-Markovian domains [J]. Mizutani Eiji, Dreyfus Stuart. Annals of Operations Research. 2017, No. 1
2. Machine learning for pricing American options in high-dimensional Markovian and non-Markovian models [J]. Quantitative Finance. 2020, No. 4
3. Analysis of a non-Markovian queueing model: Bayesian statistics and MCMC methods [J]. Hayette Braham, Louiza Berdjoudj, Mohamed Boualem. Monte Carlo Methods and Applications. 2019, No. 2
4. Bayesian Reinforcement Learning in Markovian and non-Markovian Tasks [C]. Adnane Ez-Zizi, Simon Farrell, David Leslie. IEEE Symposium Series on Computational Intelligence. 2015
5. An echo state model of non-Markovian reinforcement learning [D]. Bush, Keith A. 2008
6. Human and Machine Learning in Non-Markovian Decision Making [O]. Aaron Michael Clarke, Johannes Friedrich, Elisa M. Tartaglia
7. Bayesian Reinforcement Learning in Markovian and non-Markovian Tasks [O]. Ez-Zizi, Adnane, Farrell, Simon, Leslie, David Stuart. 2015