Engineering Applications of Artificial Intelligence

Accelerating decentralized reinforcement learning of complex individual behaviors


Abstract

Many real-world Reinforcement Learning (RL) applications have multi-dimensional action spaces that suffer from a combinatorial explosion of complexity. Implementing Centralized RL (CRL) systems may therefore become infeasible, due to the exponential increase of dimensionality in both the state space and the action space and the large number of training trials required. To address these issues, this paper proposes using Decentralized Reinforcement Learning (DRL) to alleviate the effects of the curse of dimensionality on the action space, and transferring knowledge to reduce the number of training episodes needed to achieve asymptotic convergence. Three DRL schemes are compared: DRL with independent learners and no prior coordination (DRL-Ind); DRL accelerated and coordinated by the Control Sharing (DRL+CoSh) knowledge transfer approach; and a proposed DRL scheme using the CoSh-based variant Nearby Action Sharing, which incorporates a measure of uncertainty into the CoSh procedure (DRL+NeASh). These three schemes are analyzed through an extensive experimental study and validated on two complex real-world problems, the inwalk-kicking and ball-dribbling behaviors, both performed with humanoid biped robots. The results show (empirically): (i) the effectiveness of DRL systems, which are able to achieve asymptotic convergence through indirect coordination even without prior coordination; (ii) that the proposed knowledge transfer methods make it possible to reduce the number of training episodes and to coordinate the DRL process; and (iii) that the obtained learning times are between 36% and 62% faster than the DRL-Ind scheme in the case studies.
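The DRL-Ind baseline described in the abstract can be sketched as independent Q-learners, one per action dimension, each observing the shared state and receiving the same global reward (the "indirect coordination" signal). The sketch below is a minimal illustration of that idea; the state/action sizes, hyperparameters, and function names are illustrative assumptions, not taken from the paper. Note the storage saving: a centralized learner needs |S|·|A|^d table entries, while d independent learners need only d·|S|·|A|.

```python
import random

# Illustrative sizes (assumptions, not from the paper): a shared state
# space of 100 states and 3 action dimensions with 5 choices each.
N_STATES, N_ACTIONS_PER_DIM, N_DIMS = 100, 5, 3
ALPHA, GAMMA, EPSILON = 0.1, 0.95, 0.1

# One independent Q-table per action dimension (DRL-Ind style).
q_tables = [[[0.0] * N_ACTIONS_PER_DIM for _ in range(N_STATES)]
            for _ in range(N_DIMS)]

def select_joint_action(state):
    """Each learner picks its own action component epsilon-greedily;
    the joint action is simply the tuple of the components."""
    joint = []
    for q in q_tables:
        if random.random() < EPSILON:
            joint.append(random.randrange(N_ACTIONS_PER_DIM))
        else:
            row = q[state]
            joint.append(row.index(max(row)))
    return tuple(joint)

def update(state, joint_action, reward, next_state):
    """All learners share the same global reward, which is what allows
    indirect coordination to emerge without explicit communication."""
    for q, a in zip(q_tables, joint_action):
        best_next = max(q[next_state])
        q[state][a] += ALPHA * (reward + GAMMA * best_next - q[state][a])

# Table-size comparison: centralized vs. decentralized storage.
centralized_entries = N_STATES * N_ACTIONS_PER_DIM ** N_DIMS    # 12500
decentralized_entries = N_DIMS * N_STATES * N_ACTIONS_PER_DIM   # 1500
```

With these toy sizes the decentralized factorization already shrinks the Q-representation by more than 8x, and the gap widens exponentially with the number of action dimensions, which is the motivation the abstract gives for DRL over CRL.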
