LEARNING FROM ACTIONS NOT TAKEN IN MULTIAGENT SYSTEMS

Tumer K; Khani N


Abstract

In large cooperative multiagent systems, coordinating the actions of the agents is critical to the overall system achieving its intended goal. Even when the agents aim to cooperate, ensuring that their actions lead to good system-level behavior becomes increasingly difficult as systems grow larger. One of the fundamental difficulties in such multiagent systems is the slow learning process: an agent not only needs to learn how to behave in a complex environment, but also needs to account for the actions of other learning agents. In this paper, we present a multiagent learning approach that significantly improves the learning speed in multiagent systems by allowing an agent to update its estimate of the rewards (e.g., the value function in reinforcement learning) for all its available actions, not just the action that was taken. This approach is based on an agent estimating the counterfactual reward it would have received had it taken a particular action. Our results show that the rewards for such "actions not taken" are beneficial early in training, particularly when only particular "key" actions are used. We then present results where agent teams are leveraged to estimate those rewards. Finally, we show that the improved learning speed is critical in dynamic environments, where agents must learn quickly to track the underlying processes.
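The core idea in the abstract, updating the value estimate of every available action rather than only the one taken, can be illustrated with a minimal Python sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the congestion-style `system_reward`, the way `counterfactual_reward` is estimated (moving only this agent's choice while holding the other agents fixed), the learning rate `alpha`, and all function names.

```python
import math


def system_reward(counts, capacity=2.0):
    """Hypothetical congestion-style system reward (an assumption, not from the paper).

    counts[i] is the number of agents that chose action i.
    """
    return sum(x * math.exp(-x / capacity) for x in counts)


def counterfactual_reward(counts, taken, alternative, capacity=2.0):
    """Estimate the reward had this agent chosen `alternative` instead of `taken`.

    Assumes the other agents' choices stay fixed (an illustrative simplification).
    """
    shifted = list(counts)
    shifted[taken] -= 1
    shifted[alternative] += 1
    return system_reward(shifted, capacity)


def update_all_actions(values, counts, taken, alpha=0.1):
    """Move every action's value estimate toward its actual or counterfactual reward."""
    updated = list(values)
    for action in range(len(values)):
        if action == taken:
            target = system_reward(counts)  # reward actually received
        else:
            target = counterfactual_reward(counts, taken, action)  # action not taken
        updated[action] += alpha * (target - updated[action])
    return updated


# One agent among four picked action 0; the joint choice counts are [3, 1, 0].
values = update_all_actions([0.0, 0.0, 0.0], counts=[3, 1, 0], taken=0)
print(values)
```

In this toy setting a single step already ranks the uncrowded actions above the crowded one that was taken, which is the kind of early-training benefit the abstract attributes to learning from actions not taken.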


Bibliographic details

Source: Advances in Complex Systems, 2009, No. 5, 19 pages
Authors: Tumer K; Khani N
Original format: PDF
Language: English
Chinese Library Classification: Automation systems theory
Keywords: Multiagent learning; counterfactual reward; difference reward
Date indexed: 2022-08-18 09:52:54

