
Multiagent reactive plan application learning in dynamic environments.



Abstract

This dissertation studies how we can build a multiagent system that can learn to execute high-level strategies in complex, dynamic, and uncertain domains. We assume that agents do not communicate explicitly and that each operates autonomously with only its local view of the world. Designing multiagent systems for real-world applications is challenging because of the prohibitively large state and action spaces. Dynamic changes in an environment require reactive responses, and the complexity and uncertainty inherent in real settings require that individual agents keep their commitment to achieving common goals despite adversities. Therefore, a balance between reaction and reasoning is necessary to accomplish goals in real-world environments. Most work in multiagent systems approaches this problem using bottom-up methodologies. However, bottom-up methodologies are severely limited, since they cannot learn alternative strategies, which are essential for dealing with highly dynamic, complex, and uncertain environments where convergence to a single-strategy behavior is virtually impossible to obtain. Our methodology is knowledge-based and combines top-down and bottom-up approaches to problem solving in order to take advantage of the strengths of both. We use symbolic plans that define what the individual agents in a collaborative group need to do to achieve multi-step goals that span time, but these plans do not initially specify how to implement those goals in each given situation. During training, agents acquire application knowledge using case-based learning, and, using this training knowledge, agents apply plans in realistic settings. During application, they use a naive form of reinforcement learning that allows them to make increasingly better decisions about which specific implementation to select in each situation. Experimentally, we show that, as the complexity of plans increases, the version of our system with naive reinforcement learning increasingly outperforms both the version that retrieves and applies unreinforced training knowledge and the version that reacts to dynamic changes using search.
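The abstract outlines a concrete mechanism: case-based learning stores plan-step implementations during training, and a naive form of reinforcement learning later biases the choice among those stored implementations. As a rough, hypothetical sketch of that idea only (not the dissertation's actual code; the class PlanStepCaseBase, the methods store/select/reinforce, the epsilon and alpha parameters, and the example situation names are all invented for illustration), a minimal Python version might look like:

```python
import random
from collections import defaultdict


class PlanStepCaseBase:
    """Toy case base: maps a situation signature to candidate plan-step
    implementations learned during training, with naively reinforced
    value estimates used to choose among them at application time."""

    def __init__(self):
        self.cases = defaultdict(list)    # situation -> [implementation, ...]
        self.value = defaultdict(float)   # (situation, implementation) -> estimate

    def store(self, situation, implementation):
        """Training (case-based learning): remember an implementation that
        achieved the plan step in this situation."""
        if implementation not in self.cases[situation]:
            self.cases[situation].append(implementation)

    def select(self, situation, epsilon=0.1):
        """Application: pick an implementation, mostly greedily by reinforced
        value, occasionally exploring; None means no case matches and the
        agent must fall back on reactive search."""
        candidates = self.cases.get(situation)
        if not candidates:
            return None
        if random.random() < epsilon:
            return random.choice(candidates)
        return max(candidates, key=lambda impl: self.value[(situation, impl)])

    def reinforce(self, situation, implementation, reward, alpha=0.2):
        """Naive reinforcement: move the value estimate toward the observed
        outcome of applying this implementation."""
        key = (situation, implementation)
        self.value[key] += alpha * (reward - self.value[key])


if __name__ == "__main__":
    cb = PlanStepCaseBase()
    # Training phase: two alternative implementations for one situation.
    cb.store("ball-near-goal", ("intercept", "pass-left"))
    cb.store("ball-near-goal", ("intercept", "shoot"))
    # Application phase: select, act, then reinforce the observed outcome.
    impl = cb.select("ball-near-goal")
    cb.reinforce("ball-near-goal", impl, reward=1.0)  # pretend the step succeeded
    print(impl, cb.value)
```

In the abstract's terms, a failed retrieval (select returning None) corresponds to falling back on reactive search, while successful applications feed rewards back through reinforce so that better implementations are chosen increasingly often.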

Bibliographic Details

  • Author: Sevay, Huseyin
  • Author affiliation: The University of Kansas
  • Degree grantor: The University of Kansas
  • Subject: Computer Science
  • Degree: Ph.D.
  • Year: 2004
  • Pagination: 228 p.
  • Total pages: 228
  • Format: PDF
  • Language: eng
  • CLC classification: Automation technology, computer technology
  • Keywords:
