Conference: International Conference on Applications of Digital Information and Web Technologies

Distributing Rewards by Strategic Knowledge based on Nash-Q Learning



Abstract

In this investigation, we examine a collaborative approach to reward distribution among multiple players in repeated general-sum stochastic games, in terms of position and rewards. Several investigations of reward distribution have been discussed so far, and reinforcement learning has been considered useful because no knowledge is needed in advance and better decisions can be extracted while learning. In particular, Q-learning has received much attention in single-agent environments. In multi-agent environments, however, the target of the problem is less sharply defined: what is the optimal principle? In this work, we discuss how to distribute rewards by treating the problem as a general stochastic game in the sense of game theory. That is, we introduce the Nash-Q approach, which combines Nash equilibrium with Q-learning. We show that the new approach provides a new strategic solution. We discuss experiments on rather complicated games (game of life) to assess the usefulness of the approach.
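The abstract does not give the update rule itself, so the following is only a minimal sketch of how Nash equilibrium and Q-learning are commonly combined in tabular Nash-Q learning, not the authors' implementation. It assumes two agents, Q-tables indexed as `Q[state][a1][a2]`, and a stage-game equilibrium found by enumerating pure strategies. The names `pure_nash` and `nash_q_update`, the parameters `alpha` and `gamma`, and the fallback used when no pure equilibrium exists are all hypothetical choices made for illustration.

```python
from itertools import product


def pure_nash(q1, q2):
    """Return the first pure-strategy Nash equilibrium (a1, a2) of the
    bimatrix stage game (q1, q2), or None if there is none.
    Assumption: only pure strategies are searched."""
    n1, n2 = len(q1), len(q1[0])
    for a1, a2 in product(range(n1), range(n2)):
        # a1 must be a best response to a2 for player 1 ...
        best1 = all(q1[a1][a2] >= q1[b][a2] for b in range(n1))
        # ... and a2 a best response to a1 for player 2.
        best2 = all(q2[a1][a2] >= q2[a1][b] for b in range(n2))
        if best1 and best2:
            return a1, a2
    return None


def nash_q_update(Q1, Q2, s, a1, a2, r1, r2, s_next, alpha=0.1, gamma=0.9):
    """Apply one Nash-Q style update to both agents' tables in place.

    Each agent moves Q_i(s, a1, a2) toward r_i + gamma * NashQ_i(s_next),
    where NashQ_i is agent i's payoff at an equilibrium of the stage game
    defined by the current Q-values in s_next."""
    eq = pure_nash(Q1[s_next], Q2[s_next])
    if eq is None:
        # Simplifying assumption: with no pure equilibrium, fall back to the
        # joint action with the largest summed Q-value. A fuller version
        # would solve for a mixed-strategy equilibrium instead.
        n1, n2 = len(Q1[s_next]), len(Q1[s_next][0])
        eq = max(product(range(n1), range(n2)),
                 key=lambda a: Q1[s_next][a[0]][a[1]] + Q2[s_next][a[0]][a[1]])
    e1, e2 = eq
    v1, v2 = Q1[s_next][e1][e2], Q2[s_next][e1][e2]
    Q1[s][a1][a2] = (1 - alpha) * Q1[s][a1][a2] + alpha * (r1 + gamma * v1)
    Q2[s][a1][a2] = (1 - alpha) * Q2[s][a1][a2] + alpha * (r2 + gamma * v2)


def zero_tables(n_states, n1, n2):
    """Build a zero-initialised Q-table of shape [n_states][n1][n2]."""
    return [[[0.0] * n2 for _ in range(n1)] for _ in range(n_states)]
```

A typical call would create two tables with `zero_tables`, then invoke `nash_q_update` once per observed transition. The difference from single-agent Q-learning is that the bootstrap target uses an equilibrium payoff of the next state's stage game rather than each agent's own maximum.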
