Multi-agent learning: The agent-centric agenda.

Abstract

Recent years have seen a rapidly growing interest in multi-agent systems, and in particular in learning algorithms for such systems. The problem of learning in a multi-agent system is qualitatively more difficult than the single-agent learning situation: the optimal policy is now fundamentally dependent on the strategies employed by the other agents, and the environment is dynamic, changing character over time as the agents adapt to one another's behavior.

There are a number of different research agendas one could pursue under the name of learning in multi-agent systems. I will focus on an agent-centric agenda, in which one asks how an agent should act in order to maximize its rewards in the presence of other agents who may also be learning (using the same or other learning algorithms). After discussing existing work from both artificial intelligence and game theory, I define a new formal criterion to guide the development of algorithms in this setting. This new criterion takes as a parameter a class of opponents and requires the agent to learn to behave optimally against members of that class, while providing a security-value guarantee against all opponents and yielding high reward when paired with other copies of the same algorithm.

Using this criterion as a guideline, I will describe a modular approach for achieving effective agent-centric learning. I demonstrate the power of this approach in the setting of known general-sum repeated games by providing several specific instantiations for both stationary opponents and opponents with bounded recall. These algorithms are shown both to satisfy the formal criterion and to achieve superior experimental results when compared with existing algorithms in comprehensive computer testing.

I then discuss possible extensions of these algorithms to situations in which the agent has limited access to the payoff structure of the game. These extensions include a novel algorithm that is applicable under a wide variety of informational assumptions and shows significant improvements over existing approaches from the literature. I conclude with an analysis of the value of teaching in achieving high empirical payoffs and some suggestions for future work.
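To make the three-part criterion concrete, the following is a plausible formalization reconstructed from the abstract alone; the notation ($\pi$, $O$, $R_T$, $\epsilon$) and the exact benchmark values are assumptions, not taken verbatim from the dissertation.

```latex
% A sketch of the three requirements, assuming a learning agent \pi,
% a target class of opponents O, and R_T(\pi, o) the average reward
% over the first T stages against opponent o. Notation assumed here.
\begin{align*}
  \text{(optimality vs.\ the class)} \quad & \forall o \in O:\;
     \liminf_{T \to \infty} R_T(\pi, o) \;\ge\; \mathrm{BR}(o) - \epsilon \\
  \text{(security vs.\ all opponents)} \quad & \forall o:\;
     \liminf_{T \to \infty} R_T(\pi, o) \;\ge\; \underline{v} - \epsilon \\
  \text{(self-play)} \quad &
     \liminf_{T \to \infty} R_T(\pi, \pi) \;\ge\; v^{*} - \epsilon
\end{align*}
% BR(o): best-response payoff against o; \underline{v}: the maxmin
% (security) value; v^{*}: a high benchmark for identical-algorithm play.
```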
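Likewise, here is a minimal sketch of the modular idea for the stationary-opponent case in a known two-player repeated game: model the opponent as stationary, best-respond to the empirical model, and fall back to a security strategy if average reward drops too low. This is an illustration under assumed details (class name, Laplace smoothing, pure-strategy maxmin, fallback threshold), not the dissertation's algorithm.

```python
# A sketch of an agent-centric learner for a known repeated bimatrix game.
# Assumptions: stationary opponent model, pure maxmin as the security
# fallback (a full LP would give the mixed maxmin strategy).
import numpy as np

class AgentCentricLearner:
    def __init__(self, payoff, opp_actions, margin=0.05):
        self.payoff = np.asarray(payoff, dtype=float)  # payoff[i, j]: our reward
        self.counts = np.ones(opp_actions)             # Laplace-smoothed counts
        self.margin = margin
        self.total_reward = 0.0
        self.t = 0
        # Pure-strategy security (maxmin) action and value.
        self.security_action = int(np.argmax(self.payoff.min(axis=1)))
        self.security_value = float(self.payoff.min(axis=1).max())

    def act(self):
        # Fall back to security play if empirical average dips too low.
        if self.t > 20 and self.total_reward / self.t < self.security_value - self.margin:
            return self.security_action
        # Otherwise best-respond to the estimated stationary opponent.
        belief = self.counts / self.counts.sum()
        return int(np.argmax(self.payoff @ belief))

    def observe(self, my_action, opp_action):
        self.counts[opp_action] += 1
        self.total_reward += self.payoff[my_action, opp_action]
        self.t += 1

if __name__ == "__main__":
    # Toy game: rock-paper-scissors against a fixed (stationary) mix.
    rng = np.random.default_rng(0)
    A = np.array([[0, -1, 1], [1, 0, -1], [-1, 1, 0]], dtype=float)
    agent = AgentCentricLearner(A, opp_actions=3)
    opp_mix = np.array([0.6, 0.3, 0.1])
    for _ in range(1000):
        a = agent.act()
        o = rng.choice(3, p=opp_mix)
        agent.observe(a, o)
    print("avg reward:", agent.total_reward / agent.t)
```

In the modular approach the abstract describes, one would presumably swap in a different opponent model (e.g. bounded recall instead of stationary) or fallback policy as separate components; those richer instantiations are what the dissertation evaluates.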

Record details

  • Author: Powers, Rob.
  • Affiliation: Stanford University.
  • Degree grantor: Stanford University
  • Subjects: Artificial Intelligence; Computer Science
  • Degree: Ph.D.
  • Year: 2006
  • Pages: 129 p.
  • Total pages: 129
  • Format: PDF
  • Language: English
  • CLC classification: Artificial intelligence theory; Automation technology, computer technology
  • Keywords:
  • Date added: 2022-08-17 11:40:38
