
Learning-Rate Adjusting Q-Learning for Prisoner's Dilemma Games



Abstract

Many multiagent Q-learning algorithms have been proposed to date, and most of them aim to converge to a Nash equilibrium, which is not desirable in games like the Prisoner's Dilemma (PD). In a previous paper, the author proposed utility-based Q-learning for the PD, which uses utilities as rewards in order to maintain mutual cooperation once it has occurred. However, since an agent's action depends on the relative ordering of its Q-values, mutual cooperation can also be maintained by adjusting the learning rate of Q-learning. Thus, in this paper, we deal with the learning rate directly and introduce a new Q-learning method called learning-rate adjusting Q-learning, or LRA-Q.
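The abstract does not specify LRA-Q's actual adjustment schedule, so the following Python sketch only illustrates the general idea it describes: a tabular Q-learner for the iterated PD whose learning rate is damped once mutual cooperation is observed. The payoff values, the halving rule, the alpha floor, and the "opponent's last action" state encoding are all illustrative assumptions, not the paper's specification.

import random

# Prisoner's Dilemma payoffs for the row player (T > R > P > S):
# actions: 0 = cooperate, 1 = defect
PAYOFF = {
    (0, 0): 3,  # R: mutual cooperation
    (0, 1): 0,  # S: sucker's payoff
    (1, 0): 5,  # T: temptation to defect
    (1, 1): 1,  # P: mutual defection
}

class LRAQAgent:
    # Q-learning agent whose learning rate is damped once mutual
    # cooperation occurs (hypothetical rule; the paper's actual
    # LRA-Q schedule is not given in the abstract).
    def __init__(self, alpha=0.5, alpha_min=0.01, gamma=0.9, epsilon=0.1):
        self.alpha = alpha          # current learning rate
        self.alpha_min = alpha_min  # lower bound after repeated damping
        self.gamma = gamma          # discount factor
        self.epsilon = epsilon      # exploration rate
        # state: opponent's previous action (2 = no history yet)
        self.q = {(s, a): 0.0 for s in (0, 1, 2) for a in (0, 1)}

    def act(self, state):
        if random.random() < self.epsilon:
            return random.choice((0, 1))
        return max((0, 1), key=lambda a: self.q[(state, a)])

    def update(self, state, action, reward, next_state, mutual_coop):
        # Hypothetical adjustment: halve the learning rate after mutual
        # cooperation so the Q-value ordering that produced it is not
        # unlearned by the one-shot temptation to defect.
        if mutual_coop:
            self.alpha = max(self.alpha_min, self.alpha * 0.5)
        best_next = max(self.q[(next_state, a)] for a in (0, 1))
        td_error = reward + self.gamma * best_next - self.q[(state, action)]
        self.q[(state, action)] += self.alpha * td_error

def self_play(rounds=2000):
    a1, a2 = LRAQAgent(), LRAQAgent()
    s1 = s2 = 2  # no history at the start
    for _ in range(rounds):
        act1, act2 = a1.act(s1), a2.act(s2)
        coop = (act1 == 0 and act2 == 0)
        a1.update(s1, act1, PAYOFF[(act1, act2)], act2, coop)
        a2.update(s2, act2, PAYOFF[(act2, act1)], act1, coop)
        s1, s2 = act2, act1  # next state = opponent's last action
    return a1, a2

Under these assumptions, once cooperation sets in the small learning rate means an occasional exploratory defection (and its temptation payoff) shifts the Q-values too little to flip the greedy action away from cooperating, which is the effect the abstract attributes to adjusting the learning rate.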
