European Conference on Artificial Intelligence

Leader-Follower MDP Models with Factored State Space and Many Followers - Followers Abstraction, Structured Dynamics and State Aggregation



Abstract

The Leader-Follower Markov Decision Processes (LF-MDP) framework extends both Markov Decision Processes (MDP) and Stochastic Games. It provides a model in which an agent (the leader) can influence a set of other agents (the followers) playing a stochastic game, by modifying their immediate reward functions but not their dynamics. All agents are assumed to act selfishly and to optimize their own long-term expected reward. Finding equilibrium strategies in an LF-MDP is hard, especially when the joint state space of the followers is factored: in that case, solution time is exponential in the number of followers. Our theoretical contribution is threefold. First, we analyze a natural assumption (substitutability of followers), which holds in many applications. Under this assumption, we show that an LF-MDP can be solved exactly in polynomial time, provided that deterministic equilibria exist for all games encountered in the LF-MDP. Second, we show that an additional assumption of sparsity of the problem dynamics allows us to decrease the exponent of the polynomial. Finally, we present a state-aggregation approximation, which further decreases the exponent and allows us to approximately solve large problems. We empirically validate the LF-MDP approach on a class of realistic animal disease control problems. For problems of this class, we find deterministic equilibria for all games. Using our first two results, we are able to solve the exact LF-MDP problem with 15 followers (compared to 6 or 7 in the original model). Using state aggregation, problems with up to 50 followers can be solved approximately. The approximation quality is evaluated by comparison with the exact approach on problems with 12 and 15 followers.
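The benefit of followers abstraction can be illustrated with a small counting sketch. The Python snippet below is not taken from the paper; it is a minimal illustration, assuming substitutable (interchangeable) followers that each occupy one of K local states, of why a count-based aggregated state space grows polynomially in the number of followers while the full factored joint space grows exponentially. The function names and the choice K = 3 are hypothetical.

```python
from itertools import product
from math import comb

# Illustrative sketch (not the paper's algorithm): with N substitutable
# followers, each in one of K local states, only the number of followers
# in each local state matters, not which follower is in which state.

def joint_state_space_size(n_followers: int, k_local_states: int) -> int:
    """Size of the full factored joint state space: K**N (exponential in N)."""
    return k_local_states ** n_followers

def aggregated_state_space_size(n_followers: int, k_local_states: int) -> int:
    """Number of count vectors (c_1, ..., c_K) summing to N: C(N+K-1, K-1),
    which is polynomial in N for a fixed number of local states K."""
    return comb(n_followers + k_local_states - 1, k_local_states - 1)

def aggregate(joint_state: tuple, k_local_states: int) -> tuple:
    """Map a joint follower state to its count-based abstraction."""
    return tuple(joint_state.count(s) for s in range(k_local_states))

if __name__ == "__main__":
    N, K = 15, 3  # e.g. 15 followers, 3 local states per follower
    print("joint states:     ", joint_state_space_size(N, K))       # 14348907
    print("aggregated states:", aggregated_state_space_size(N, K))  # 136
    # Sanity check on a tiny instance: aggregation is many-to-one and
    # produces exactly C(N+K-1, K-1) distinct abstract states.
    small = {aggregate(js, K) for js in product(range(K), repeat=4)}
    assert len(small) == aggregated_state_space_size(4, K)
```

Under this hypothetical encoding, exact computation over the aggregated space scales polynomially in the number of followers, which is consistent with the abstract's claim that substitutability enables exact polynomial-time solution and that state aggregation pushes the approach to larger problems.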
