
Best Response Bayesian Reinforcement Learning for Multiagent Systems with State Uncertainty


Abstract

It is often assumed that agents in multiagent systems with state uncertainty have full knowledge of the model of dynamics and sensors, but in many cases this is not feasible. A more realistic assumption is that agents must learn about the environment and other agents while acting. Bayesian methods for reinforcement learning are promising for this type of learning because they allow model uncertainty to be considered explicitly and offer a principled way of dealing with the exploration/exploitation tradeoff. In this paper, we propose a Bayesian RL framework for best response learning in which an agent has uncertainty over the environment and the policies of the other agents. This is a very general model that can incorporate different assumptions about the form of other policies. We seek to maximize performance and learn the appropriate models while acting in an online fashion by using sample-based planning built from powerful Monte-Carlo tree search methods. We discuss the theoretical properties of this approach, and experimental results show that the learning approaches can significantly increase value when compared to initial models and policies.
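The core idea the abstract describes — maintaining explicit Bayesian uncertainty over a model and choosing actions via sample-based planning — can be illustrated with a toy sketch. The snippet below is not the paper's algorithm; it is a minimal, assumed example on a Bernoulli bandit: each arm's unknown payoff probability gets a Beta posterior, and at decision time we run root-sampling simulations (draw a model from the posterior, simulate a reward under it) combined with UCB1 at the root, in the spirit of Monte-Carlo planning under model uncertainty.

```python
import math
import random


class BayesianBanditPlanner:
    """Toy sketch of Bayesian RL via sample-based planning.

    Keeps a Beta posterior per arm and plans by root sampling:
    each simulation draws a candidate model from the posterior and
    simulates one reward under it, while UCB1 balances exploration
    and exploitation across the simulated arms. Illustrative only,
    not the algorithm from the paper.
    """

    def __init__(self, n_arms, seed=0):
        self.n_arms = n_arms
        # Beta(1, 1) uniform prior over each arm's payoff probability.
        self.alpha = [1.0] * n_arms
        self.beta = [1.0] * n_arms
        self.rng = random.Random(seed)

    def plan_action(self, n_sims=2000, c=1.4):
        """Run n_sims root-sampling simulations; return the most-visited arm."""
        counts = [0] * self.n_arms
        values = [0.0] * self.n_arms
        for t in range(1, n_sims + 1):
            # Try each arm once, then pick by UCB1.
            arm = next((a for a in range(self.n_arms) if counts[a] == 0), None)
            if arm is None:
                arm = max(
                    range(self.n_arms),
                    key=lambda a: values[a] / counts[a]
                    + c * math.sqrt(math.log(t) / counts[a]),
                )
            # Root sampling: draw a model (payoff probability) from the
            # posterior, then simulate one Bernoulli reward under it.
            p = self.rng.betavariate(self.alpha[arm], self.beta[arm])
            reward = 1.0 if self.rng.random() < p else 0.0
            counts[arm] += 1
            values[arm] += reward
        return max(range(self.n_arms), key=lambda a: counts[a])

    def update(self, arm, reward):
        """Conjugate Bayesian update of the chosen arm's Beta posterior."""
        self.alpha[arm] += reward
        self.beta[arm] += 1.0 - reward
```

A typical loop alternates planning, acting in the real environment, and updating the posterior with the observed reward, so model learning happens online while acting — the same interleaving the abstract emphasizes, here in its simplest possible setting.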

Bibliographic information

  • Authors

    Oliehoek FA; Amato C;

  • Affiliation
  • Year: 2014
  • Pages
  • Format: PDF
  • Language: en
  • Classification

