Learning-Rate Adjusting Q-Learning for Prisoner's Dilemma Games

Abstract

Many multiagent Q-learning algorithms have been proposed to date, and most of them aim to converge to a Nash equilibrium, which is undesirable in games such as the Prisoner's Dilemma (PD). In a previous paper, the author proposed utility-based Q-learning for the PD, which uses utilities as rewards in order to maintain mutual cooperation once it has occurred. However, since an agent's action depends on the relative magnitudes of its Q-values, mutual cooperation can also be maintained by adjusting the learning rate of Q-learning. Thus, in this paper, we deal with the learning rate directly and introduce a new Q-learning method called learning-rate adjusting Q-learning, or LRA-Q.
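The abstract names the mechanism but not the update rule, so the following is a minimal, hypothetical sketch in Python of the idea it describes: stateless tabular Q-learning on the iterated PD in which each agent's learning rate is damped once mutual cooperation occurs, so the Q-values that favor cooperation stop being overwritten. The payoff values, the 0.9 damping factor, and all class and function names are assumptions for illustration, not the paper's actual LRA-Q rule.

```python
import random

# Payoff matrix for the Prisoner's Dilemma (row player's reward).
# Actions: 0 = cooperate (C), 1 = defect (D).
# Hypothetical values satisfying T > R > P > S (here T=5, R=3, P=1, S=0).
PAYOFF = {(0, 0): 3, (0, 1): 0, (1, 0): 5, (1, 1): 1}


class LRAQAgent:
    """Tabular Q-learner whose learning rate shrinks after mutual
    cooperation, so the Q-values supporting cooperation stop moving.
    An illustrative guess at the mechanism, not the paper's method."""

    def __init__(self, alpha=0.5, gamma=0.9, epsilon=0.1):
        self.q = {0: 0.0, 1: 0.0}  # stateless Q-values, one per action
        self.alpha = alpha         # learning rate (adjusted online)
        self.gamma = gamma         # discount factor
        self.epsilon = epsilon     # exploration probability

    def act(self):
        # Epsilon-greedy action selection.
        if random.random() < self.epsilon:
            return random.randrange(2)
        return max(self.q, key=self.q.get)

    def update(self, action, reward, mutual_cooperation):
        # Standard single-state Q-learning update.
        target = reward + self.gamma * max(self.q.values())
        self.q[action] += self.alpha * (target - self.q[action])
        # Learning-rate adjustment: damp alpha once mutual cooperation
        # occurs, so the cooperative Q-values are effectively locked in.
        if mutual_cooperation:
            self.alpha *= 0.9


def run(episodes=5000):
    a, b = LRAQAgent(), LRAQAgent()
    coop = 0
    for _ in range(episodes):
        act_a, act_b = a.act(), b.act()
        mutual = act_a == 0 and act_b == 0
        coop += mutual
        a.update(act_a, PAYOFF[(act_a, act_b)], mutual)
        b.update(act_b, PAYOFF[(act_b, act_a)], mutual)
    print(f"mutual cooperation rate: {coop / episodes:.2%}")


if __name__ == "__main__":
    run()
```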
