IEEE International Conference on Automation Science and Engineering

Constructive Policy: Reinforcement Learning Approach for Connected Multi-Agent Systems

Abstract

Policy-based reinforcement learning methods are widely used in multi-agent systems to learn optimal actions for any given state, with partial or even no model representation. However, multi-agent systems with complex structures (the curse of dimensionality) or with strong constraints (such as bio-inspired snake or serpentine robots) show limited performance in such settings, due to the sparse-reward nature of the environment and the lack of a fully observable model representation. In this paper we present a constructive learning and planning scheme that reduces the complexity of a high-dimensional agent model by decomposing it into an identical, connected, and scaled-down multi-agent structure, and then applies the learning framework in layers of local and global rank. Our layered hierarchy also decomposes the final goal into multiple sub-tasks and a global task (the final goal) that is a bias-induced function of the local sub-tasks. The local layer learns a 'reusable' local policy with which a local agent achieves a sub-task optimally; that local policy can then be reused by other identical local agents. The global layer, in turn, learns a policy that applies the right combination of local policies, parameterized over the entire connected structure of local agents, to achieve the global task through the collaborative construction of local agents. After the local policies are learned, and while the global policy is being learned, the framework generates sub-tasks for each local agent and accepts the local agents' intrinsic rewards as a positive bias toward the maximum global reward, based on optimal sub-task assignments. The advantages of the proposed approach include better exploration due to the decomposition of dimensions, and reusability of the learning paradigm over extended dimension spaces. We apply the constructive policy method to a serpentine robot with hyper-redundant degrees of freedom (DOF) to achieve optimal control, and we also outline a connection to hierarchical apprenticeship learning methods, which can be seen as a layered learning framework for complex control tasks.
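To make the layered scheme concrete, below is a minimal toy sketch in Python (not the authors' implementation) of the two layers the abstract describes: a single tabular Q-learning policy learned once at the local layer and shared by all identical segments, and a global layer that scores candidate sub-task assignments with a global shape reward positively biased by the segments' intrinsic rewards. All names, the discretized environment, and the candidate assignments are illustrative assumptions.

import random

N_SEG = 6                       # identical local agents (robot segments)
ANGLES = [-2, -1, 0, 1, 2]      # discretized joint angles
ACTIONS = (-1, 0, 1)            # local action: decrement / hold / increment angle

class LocalPolicy:
    """One reusable tabular Q-learning policy shared by all identical segments.
    The state is the signed error (angle - target), so the same Q-table
    solves any sub-task of the form 'reach target angle t'."""
    def __init__(self, alpha=0.2, gamma=0.9, eps=0.1):
        self.q = {(e, a): 0.0 for e in range(-4, 5) for a in ACTIONS}
        self.alpha, self.gamma, self.eps = alpha, gamma, eps

    def act(self, err, greedy=False):
        if not greedy and random.random() < self.eps:
            return random.choice(ACTIONS)
        return max(ACTIONS, key=lambda a: self.q[(err, a)])

    def update(self, err, a, r, err2):
        best = max(self.q[(err2, b)] for b in ACTIONS)
        self.q[(err, a)] += self.alpha * (r + self.gamma * best - self.q[(err, a)])

def run_segment(policy, target, steps=10, learn=True):
    """Drive one segment toward its assigned sub-task; return (intrinsic reward, final angle)."""
    angle, total = 0, 0.0
    for _ in range(steps):
        err = angle - target
        a = policy.act(err, greedy=not learn)
        angle = max(min(angle + a, 2), -2)
        r = -abs(angle - target)          # intrinsic reward: closeness to the sub-task
        if learn:
            policy.update(err, a, r, angle - target)
        total += r
    return total, angle

# Local layer: learn the reusable sub-task policy once, over random targets.
local = LocalPolicy()
for _ in range(2000):
    run_segment(local, target=random.choice(ANGLES))

# Global layer: choose the sub-task assignment over the connected structure.
# Candidates mimic simple serpentine patterns; the global reward is the match
# to a desired body shape, positively biased by the local intrinsic rewards.
candidates = [
    tuple((i % 3) - 1 for i in range(N_SEG)),        # shallow wave
    tuple(2 * (i % 2) - 1 for i in range(N_SEG)),    # alternating
    tuple(0 for _ in range(N_SEG)),                  # straight
]
desired_shape = tuple((i % 3) - 1 for i in range(N_SEG))

def global_reward(assignment):
    intrinsic, angles = 0.0, []
    for target in assignment:
        r, angle = run_segment(local, target, learn=False)
        intrinsic += r
        angles.append(angle)
    shape_err = sum(abs(a - d) for a, d in zip(angles, desired_shape))
    return -shape_err + 0.1 * intrinsic  # intrinsic rewards bias the global reward

best = max(candidates, key=global_reward)
print("chosen sub-task assignment:", best)

In this toy version the "global policy" degenerates to a one-step choice over a fixed candidate set; the paper's global layer is a learned policy over the full connected structure, but the bias structure (global task reward plus weighted local intrinsic rewards) is the same idea.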
