Conference: International Workshop on Monitoring, Security, and Rescue Techniques in Multiagent Systems

Distributed Adaptive Control: Beyond Single-Instant, Discrete Control Variables

Abstract

In extensive-form noncooperative game theory, at each instant t, each agent i sets its state x_i independently of the other agents by sampling an associated distribution, q_i(x_i). The coupling between the agents arises in the joint evolution of those distributions. Distributed control problems can be cast the same way. In those problems the system designer sets aspects of the joint evolution of the distributions to try to optimize the goal for the overall system. Information theory tells us what the separate q_i of the agents are most likely to be if the system were to have a particular expected value of the objective function G(x_1, x_2, ...). So one can view the job of the system designer as speeding an iterative process. Each step of that process starts with a specified value of E_q(G) and proceeds by converging the q_i to the most likely set of distributions consistent with that value. After this the target value for E_q(G) is lowered, and the process repeats. Previous work has elaborated many schemes for implementing this process when the underlying variables x_i all have a finite number of possible values and G does not extend to multiple instants in time. That work is also based on a fixed mapping from agents to control devices, so that the statistical independence of the agents' moves means independence of the device states. This paper extends that work to relax all of these restrictions, which broadens its applicability to include continuous spaces and reinforcement learning. This paper also elaborates how some of that earlier work can be viewed as a first-principles justification of evolution-based search algorithms.
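
The iterative process described in the abstract can be illustrated for the restricted setting it attributes to previous work (finite state spaces, a single instant). The following is a minimal sketch under stated assumptions, not the authors' scheme: it assumes the "most likely" product distribution is found by minimizing a maximum-entropy Lagrangian of the form E_q(G) - T*S(q), that each q_i is moved toward a Boltzmann distribution over its conditional expected G, and that lowering the temperature T plays the role of lowering the target E_q(G). The function names, the damping factor, the temperature schedule, and the toy objective `toy_G` are all hypothetical choices made for illustration.

```python
import itertools
import math


def cond_expected_G(G, q, i, xi, sizes):
    """E[G | x_i = xi], averaging over the other agents' product distribution."""
    others = [range(n) for j, n in enumerate(sizes) if j != i]
    total = 0.0
    for rest in itertools.product(*others):
        x = list(rest)
        x.insert(i, xi)  # rebuild the full joint state with agent i fixed
        p = 1.0
        for j, xj in enumerate(x):
            if j != i:
                p *= q[j][xj]
        total += p * G(tuple(x))
    return total


def expected_G(G, q, sizes):
    """E_q[G] under the product distribution q = q_1 * q_2 * ..."""
    total = 0.0
    for x in itertools.product(*[range(n) for n in sizes]):
        p = math.prod(q[j][xj] for j, xj in enumerate(x))
        total += p * G(x)
    return total


def boltzmann_update(G, q, sizes, T, step=0.5):
    """One damped, parallel move of every q_i toward exp(-E[G | x_i] / T)."""
    new_q = []
    for i, n in enumerate(sizes):
        e = [cond_expected_G(G, q, i, xi, sizes) for xi in range(n)]
        m = min(e)  # subtract the minimum for numerical stability
        w = [math.exp(-(ei - m) / T) for ei in e]
        z = sum(w)
        target = [wi / z for wi in w]
        new_q.append([(1 - step) * a + step * b for a, b in zip(q[i], target)])
    return new_q


def anneal(G, sizes, temps=(2.0, 1.0, 0.5, 0.1, 0.02), inner_iters=50):
    """Let the q_i settle at each temperature, then lower T and repeat."""
    q = [[1.0 / n] * n for n in sizes]  # start from uniform distributions
    history = []
    for T in temps:
        for _ in range(inner_iters):
            q = boltzmann_update(G, q, sizes, T)
        history.append((T, expected_G(G, q, sizes)))
    return q, history


# Hypothetical toy objective: two agents with 4 and 3 discrete states.
SIZES = (4, 3)


def toy_G(x):
    return (x[0] - 2) ** 2 + (x[1] - 1) ** 2 + 0.5 * abs(x[0] - x[1])


if __name__ == "__main__":
    q, history = anneal(toy_G, SIZES)
    for T, eg in history:
        print(f"T={T:<5} E_q(G)={eg:.4f}")
    print("most probable joint state:",
          tuple(max(range(n), key=qi.__getitem__) for qi, n in zip(q, SIZES)))
```

Each agent samples only from its own q_i, so the coupling between agents enters solely through the conditional expectations used in the update. This sketch computes those expectations by exact enumeration, which is feasible only for very small problems; larger problems would need them estimated from samples.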