Multi-Agent Reinforcement Learning (MARL) algorithms face two main difficulties: the curse of dimensionality, and environment non-stationarity due to the independent learning processes carried out concurrently by the agents. In this paper we formalize and prove the convergence of a Distributed Round Robin Q-learning (D-RR-QL) algorithm for cooperative systems. The computational complexity of this algorithm increases linearly with the number of agents. Moreover, it eliminates environment non-stationarity by carrying out round-robin scheduling of action selection and execution. This learning scheme allows the implementation of Modular State-Action Vetoes (MSAV) in cooperative multi-agent systems, which speeds up learning convergence in over-constrained systems by vetoing state-action pairs that lead to undesired termination states (UTS) in the relevant state-action subspace. Each agent’s local state-action value function learning is an independent process, including the MSAV policies. Coordination of locally optimal policies to obtain the globally optimal joint policy is achieved by a greedy selection procedure using message passing. We show that D-RR-QL improves over state-of-the-art approaches, such as Distributed Q-Learning, Team Q-Learning and Coordinated Reinforcement Learning, in a paradigmatic Linked Multi-Component Robotic System (L-MCRS) control problem: the hose transportation task. L-MCRS are over-constrained systems with many UTS induced by the interaction of the passive linking element and the active mobile robots.
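The two mechanisms named above, round-robin scheduling of action selection (so each agent learns against a stationary environment) and MSAV vetoing of state-action pairs that reach an UTS, can be sketched roughly as follows. This is a minimal illustration, not the paper's implementation; all class, function and parameter names (`RoundRobinAgent`, `round_robin_step`, `env_step`) are hypothetical.

```python
import random
from collections import defaultdict

class RoundRobinAgent:
    """One agent's independent local Q-table plus its MSAV veto set."""
    def __init__(self, actions, alpha=0.1, gamma=0.9, eps=0.1):
        self.q = defaultdict(float)   # (state, action) -> estimated value
        self.vetoed = set()           # state-action pairs known to reach an UTS
        self.actions = actions
        self.alpha, self.gamma, self.eps = alpha, gamma, eps

    def allowed(self, state):
        """Actions not vetoed in this state (fall back to all if none remain)."""
        acts = [a for a in self.actions if (state, a) not in self.vetoed]
        return acts or self.actions

    def select(self, state):
        """Epsilon-greedy selection restricted to non-vetoed actions."""
        acts = self.allowed(state)
        if random.random() < self.eps:
            return random.choice(acts)
        return max(acts, key=lambda a: self.q[(state, a)])

    def update(self, s, a, r, s2, undesired=False):
        """Standard Q-learning update; veto the pair if it hit an UTS."""
        if undesired:
            self.vetoed.add((s, a))
        best = max((self.q[(s2, b)] for b in self.allowed(s2)), default=0.0)
        self.q[(s, a)] += self.alpha * (r + self.gamma * best - self.q[(s, a)])

def round_robin_step(agents, env_step, state):
    """Agents act strictly one at a time: while an agent selects and learns,
    no other agent moves, so the environment it observes is stationary."""
    for agent in agents:
        a = agent.select(state)
        s2, r, undesired = env_step(state, a)
        agent.update(state, a, r, s2, undesired)
        state = s2
    return state
```

Because each agent only ever updates its own Q-table between two of its own turns, the per-agent learning cost is independent of the other agents' tables, which is what makes the overall complexity grow linearly with the number of agents.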