Source: 電子情報通信学会技術研究報告. ニューロコンピューティング (IEICE Technical Report. Neurocomputing)

A multi-agent reinforcement learning method with learning of other agents for competitive game



Abstract

This report proposes a reinforcement learning (RL) method based on the Actor-Critic architecture that can be applied to partially observable multi-agent competitive games. As an example, we consider the card game "Hearts"; the learning problem then becomes a partially observable Markov decision process (POMDP). In our method, a single Hearts game is divided into three stages, and three actors are prepared so that each of them plays and learns separately in its own stage. In particular, the actor for the middle stage plays so as to enlarge the expected temporal-difference error, which is calculated from the evaluation function approximated by the critic and the estimated state transition. Computer experiments with heuristic players show that our RL method works well.
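The abstract describes an actor-critic learner in which the game is split into stages and each stage is handled by its own actor. Below is a minimal Python sketch of that staging idea, assuming a generic linear actor-critic with softmax action selection; the class name StagedActorCritic, the feature vector phi, the legal-move mask, and the learning rates are illustrative placeholders, and the sketch omits the report's POMDP state estimation and the middle-stage policy that enlarges the expected TD error.

```python
import numpy as np

class StagedActorCritic:
    """Illustrative linear actor-critic with one actor per game stage.

    A generic sketch, not the authors' exact algorithm: the critic
    learns a state-value function by TD(0), and each stage's actor is
    a softmax policy nudged by the TD error (policy-gradient step).
    """

    def __init__(self, n_features, n_actions, n_stages=3,
                 alpha_critic=0.05, alpha_actor=0.01, gamma=0.95):
        self.v = np.zeros(n_features)                      # critic weights (shared)
        self.actors = [np.zeros((n_actions, n_features))   # one actor per stage
                       for _ in range(n_stages)]
        self.ac, self.aa, self.gamma = alpha_critic, alpha_actor, gamma

    def act(self, phi, stage, legal):
        """Sample an action from the stage's softmax policy over legal moves."""
        prefs = self.actors[stage] @ phi
        prefs = np.where(legal, prefs, -np.inf)            # mask illegal cards
        probs = np.exp(prefs - prefs.max())
        probs /= probs.sum()
        action = np.random.choice(len(probs), p=probs)
        return action, probs

    def update(self, phi, action, probs, reward, phi_next, stage, done):
        """One TD(0) step for the critic and a policy-gradient step for the actor."""
        target = reward + (0.0 if done else self.gamma * (self.v @ phi_next))
        delta = target - self.v @ phi                      # temporal-difference error
        self.v += self.ac * delta * phi                    # critic update
        grad = -np.outer(probs, phi)                       # d log pi / d prefs (softmax)
        grad[action] += phi
        self.actors[stage] += self.aa * delta * grad       # actor update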
