Multiagent learning in the presence of agents with limitations.

Abstract

Learning to act in a multiagent environment is a challenging problem. Optimal behavior for one agent depends upon the behavior of the other agents, which are learning as well. Multiagent environments are therefore non-stationary, violating the stationarity assumption that underlies traditional single-agent learning. In addition, agents in complex tasks may have limitations, such as physical constraints or designer-imposed approximations of the task that make learning tractable. Limitations prevent agents from acting optimally, which further complicates an already challenging problem. A learning agent must effectively compensate for its own limitations while exploiting the limitations of the other agents. My thesis research focuses on these two challenges, namely multiagent learning and limitations, and includes four main contributions.

First, the thesis introduces the novel concepts of a variable learning rate and the WoLF (Win or Learn Fast) principle to account for other learning agents. The WoLF principle can make rational learning algorithms converge to optimal policies, and in doing so achieves two properties, rationality and convergence, which no previous technique had achieved together. The converging effect of WoLF is proven for a class of matrix games and demonstrated empirically for a wide range of stochastic games.

Second, the thesis contributes an analysis of the effect of limitations on the game-theoretic concept of Nash equilibria. The existence of equilibria is important if multiagent learning techniques, which often depend on this concept, are to be applied to realistic problems where limitations are unavoidable. The thesis introduces a general model of the effect of limitations on agent behavior, which is used to analyze the resulting impact on equilibria. The thesis shows that equilibria do exist for a few restricted classes of games and limitations, but that, in general, even well-behaved limitations do not preserve the existence of equilibria.

Third, the thesis introduces GraWoLF, a general-purpose, scalable, multiagent learning algorithm. GraWoLF combines policy-gradient learning techniques with the WoLF variable learning rate. The effectiveness of the learning algorithm is demonstrated both in a card game with an intractably large state space and in an adversarial robot task. These two tasks are complex, and agent limitations are prevalent in both.

Fourth, the thesis describes the CMDragons robot soccer team strategy for adapting to an unknown opponent. (Abstract shortened by UMI.)
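The WoLF idea pairs a hill-climbing learner with two step sizes: a small one used while the agent is "winning" (its current policy scores better against its learned values than its historical average policy does) and a larger one while it is "losing." Below is a minimal sketch in Python of WoLF-PHC, one published instantiation of this principle; the class interface, hyperparameter values, and the matching pennies demo are illustrative assumptions, not code from the thesis.

```python
# Minimal sketch of WoLF-PHC: Q-learning plus policy hill-climbing with
# the WoLF (Win or Learn Fast) variable learning rate. Interface and
# hyperparameter values are assumed for illustration.
import numpy as np

class WoLFPHC:
    def __init__(self, n_states, n_actions, alpha=0.1, gamma=0.9,
                 delta_win=0.01, delta_lose=0.04, epsilon=0.05):
        # delta_lose > delta_win: adapt quickly when losing, cautiously when winning.
        self.Q = np.zeros((n_states, n_actions))
        self.pi = np.full((n_states, n_actions), 1.0 / n_actions)      # current policy
        self.pi_avg = np.full((n_states, n_actions), 1.0 / n_actions)  # average policy
        self.counts = np.zeros(n_states)
        self.alpha, self.gamma, self.eps = alpha, gamma, epsilon
        self.dw, self.dl = delta_win, delta_lose
        self.nA = n_actions

    def act(self, s, rng):
        if rng.random() < self.eps:                     # epsilon-greedy exploration
            return int(rng.integers(self.nA))
        return int(rng.choice(self.nA, p=self.pi[s]))

    def update(self, s, a, r, s_next):
        # Standard Q-learning backup.
        self.Q[s, a] += self.alpha * (r + self.gamma * self.Q[s_next].max() - self.Q[s, a])
        # Incrementally track the average policy played so far.
        self.counts[s] += 1
        self.pi_avg[s] += (self.pi[s] - self.pi_avg[s]) / self.counts[s]
        # WoLF test: "winning" if the current policy scores better against the
        # learned Q-values than the average policy; pick the step size accordingly.
        delta = self.dw if self.pi[s] @ self.Q[s] > self.pi_avg[s] @ self.Q[s] else self.dl
        # Hill-climb toward the greedy action, then re-project onto the simplex.
        best = self.Q[s].argmax()
        self.pi[s] -= delta / (self.nA - 1)
        self.pi[s, best] += delta + delta / (self.nA - 1)
        self.pi[s] = np.clip(self.pi[s], 0.0, 1.0)
        self.pi[s] /= self.pi[s].sum()

# Self-play on matching pennies, a single-state zero-sum matrix game.
rng = np.random.default_rng(0)
p1, p2 = WoLFPHC(1, 2), WoLFPHC(1, 2)
for _ in range(50_000):
    a1, a2 = p1.act(0, rng), p2.act(0, rng)
    r = 1.0 if a1 == a2 else -1.0      # p1 wants to match, p2 to mismatch
    p1.update(0, a1, r, 0)
    p2.update(0, a2, -r, 0)
print(p1.pi[0], p2.pi[0])              # both should hover near the (0.5, 0.5) equilibrium
```

In self-play, the larger losing step size lets an exploited agent escape quickly, while the smaller winning step size keeps it from overshooting, so the two policies should oscillate with shrinking amplitude around the mixed equilibrium.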
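To make the notion of a limitation concrete, consider restricting one player's policy space in matching pennies and searching for a mutual best response within the restricted space (a restricted equilibrium). The sketch below is an assumed example, not one from the thesis; the 70% bound and the grid resolution are chosen purely for illustration.

```python
# Illustration of a "limitation": in matching pennies, restrict the row
# player to playing Heads at least 70% of the time, then grid-search for
# a mutual best response within the restricted space.
import numpy as np

def row_payoff(p, q):
    # Row player's expected payoff when row plays Heads w.p. p and
    # column plays Heads w.p. q; the game is zero-sum.
    return (2 * p - 1) * (2 * q - 1)

grid = np.linspace(0.0, 1.0, 101)
row_space = grid[grid >= 0.7 - 1e-9]   # the limitation: P(Heads) >= 0.7

for p in row_space:
    for q in grid:
        row_best = max(row_payoff(p2, q) for p2 in row_space)
        col_best = max(-row_payoff(p, q2) for q2 in grid)
        if row_payoff(p, q) >= row_best - 1e-9 and -row_payoff(p, q) >= col_best - 1e-9:
            print(f"restricted equilibrium: p = {p:.2f}, q = {q:.2f}")
```

Here the restricted policy set is a convex interval and a restricted equilibrium survives, with the limited row player conceding value (p = 0.7 against q = 0). The abstract's negative result is precisely that such existence is not guaranteed: in general, even well-behaved limitations do not preserve equilibria.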
