Deep Reinforcement Learning Based Intelligent Decision Making for Two-player Sequential Game with Uncertain Irrational Player

Abstract

This paper investigates a two-player sequential game with an unknown, non-stationary irrational player, motivated by decision making for cooperative autonomous robots. In practice, an agent's irrationality can seriously degrade the effectiveness of decision making, especially in distributed cooperative tasks such as those in multi-robot systems. Specifically, the irrationality can be caused by a mechanical failure or sensor flaw in the cooperating agent. To handle this issue, a novel dynamic evaluation system is first designed to efficiently quantify the player's level of cooperation or competition; it includes two important parameters, i.e., a cooperation index and a competitive flag. Then, a continuous deep Q network space is proposed to predict the action value with respect to a continuous cooperation index. Inspired by the framework of the "Friend or Foe" algorithm, a novel hybrid online multi-agent deep reinforcement learning algorithm is proposed. The designed algorithm can evaluate the cooperator's cooperative level as well as maximize the total payoff by learning in the continuous deep Q network space. Finally, numerical simulations and experimental tests are provided to demonstrate the effectiveness of the designed algorithm.
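
The abstract does not define the cooperation index or how it enters the action value, so the following is only a minimal tabular sketch of the general idea, not the authors' algorithm. It assumes a cooperation index in [0, 1] that linearly blends a "friend" value (best case over the partner's actions) with a "foe" value (worst case over the partner's actions), and a simple moving-average update of the index from the partner's observed action. The function names, the blending rule, the pure-strategy simplification of the foe case, and the update rate are all hypothetical choices made for illustration; the paper itself uses a deep Q network rather than a table.

```python
from typing import List, Sequence


def blended_action_values(q: Sequence[Sequence[float]], coop_index: float) -> List[float]:
    """Value of each own action under an assumed friend/foe blend.

    q[a1][a2] is the estimated payoff when we play a1 and the partner plays a2.
    coop_index = 1.0 treats the partner as a friend (best case over a2);
    coop_index = 0.0 treats the partner as a foe (worst case over a2).
    """
    if not 0.0 <= coop_index <= 1.0:
        raise ValueError("coop_index must lie in [0, 1]")
    return [coop_index * max(row) + (1.0 - coop_index) * min(row) for row in q]


def choose_action(q: Sequence[Sequence[float]], coop_index: float) -> int:
    """Pick the own action with the highest blended value (first one on ties)."""
    values = blended_action_values(q, coop_index)
    return max(range(len(values)), key=values.__getitem__)


def update_coop_index(coop_index: float, q_row: Sequence[float],
                      partner_action: int, rate: float = 0.2) -> float:
    """Hypothetical moving-average update of the cooperation index.

    The observed partner action is scored 1.0 if it was the best response for
    us in q_row, 0.0 if it was the worst, and linearly in between.
    """
    lo, hi = min(q_row), max(q_row)
    if hi == lo:
        return coop_index  # the partner's choice carries no information here
    observed = (q_row[partner_action] - lo) / (hi - lo)
    return (1.0 - rate) * coop_index + rate * observed


if __name__ == "__main__":
    q_table = [[10.0, -5.0], [3.0, 2.0]]  # made-up payoffs, rows are our actions
    print(choose_action(q_table, 1.0))    # trusting partner: risky high-payoff row
    print(choose_action(q_table, 0.0))    # adversarial partner: safe row
```

With the made-up payoff table, a fully trusted partner leads to the high-risk, high-payoff row, while a fully distrusted partner leads to the row with the best worst case. A full Friend-or-Foe treatment of the foe case would solve a minimax problem over mixed strategies instead of taking the row minimum.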

Bibliographic details

Source
IEEE Symposium Series on Computational Intelligence | 2019 | pp. 9-15 | 7 pages
Authors
Zejian Zhou; Hao Xu
Original format: PDF
Keywords
Games; Robots; Machine learning; Indexes; Decision making; Heuristic algorithms; Learning (artificial intelligence)

Similar literature

1. Spike-based Decision Learning of Nash Equilibria in Two-Player Games [J]. Johannes Friedrich, Walter Senn. PLoS Computational Biology. 2012, No. 9
2. Online concurrent reinforcement learning algorithm to solve two-player zero-sum games for partially unknown nonlinear continuous-time systems [J]. Yasini Sholeh, Karimpour Ali, Sistani Mohammad-Bagher Naghibi. International Journal of Adaptive Control and Signal Processing. 2015, No. 4
3. LL_2, a simple reinforcement learning scheme for two-player zero-sum Markov games [J]. Benoit Frenay, Marco Saerens. Neurocomputing. 2009, No. 7-9
4. Deep Reinforcement Learning Based Intelligent Decision Making for Two-player Sequential Game with Uncertain Irrational Player [C]. Zejian Zhou, Hao Xu. IEEE Symposium Series on Computational Intelligence. 2019
5. Deception in two-player zero-sum stochastic games: Theory and application to warfare games [D]. Singh, Rajdeep. 2006
6. Spike-based Decision Learning of Nash Equilibria in Two-Player Games [O]. Johannes Friedrich, Walter Senn. 2012
7. Spike-based decision learning of nash equilibria in two-player games [O]. Friedrich Johannes, Senn Walter. 2012