Journal: Frontiers in Neurorobotics

Cooperative and Competitive Reinforcement and Imitation Learning for a Mixture of Heterogeneous Learning Modules

Abstract

This paper proposes Cooperative and competitive Reinforcement And Imitation Learning (CRAIL), a method for selecting an appropriate policy from a set of heterogeneous modules while training all of them in parallel. Each learning module has its own network architecture and improves its policy using an off-policy reinforcement learning algorithm and behavior cloning from samples collected by a behavior policy constructed as a combination of all the policies. Because the mixing weights are determined by each module's performance, a better policy is selected automatically as learning progresses. Experimental results on a benchmark control task show that CRAIL achieves fast learning by allowing modules with complicated network structures to exploit task-relevant samples for training.
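
The performance-weighted mixing described in the abstract can be illustrated with a minimal Python sketch. The abstract does not say how CRAIL computes the mixing weights, so the softmax form, the `temperature` parameter, the function names `mixing_weights` and `select_module`, and the example performance values are all assumptions made for illustration, not the authors' method.

```python
import math
import random


def mixing_weights(performances, temperature=1.0):
    """Turn per-module performance estimates into mixing weights.

    Assumption: a softmax is used here only as one plausible choice;
    the abstract does not specify the actual weighting rule.
    """
    # Subtract the maximum before exponentiating for numerical stability.
    best = max(performances)
    exps = [math.exp((p - best) / temperature) for p in performances]
    total = sum(exps)
    return [e / total for e in exps]


def select_module(weights, rng):
    """Sample the index of the module whose policy acts next."""
    return rng.choices(range(len(weights)), weights=weights, k=1)[0]


# Hypothetical performance estimates for three modules (e.g. average returns).
performances = [10.0, 12.0, 11.0]
weights = mixing_weights(performances)

# The sampled module would collect experience that every module can train on.
chosen = select_module(weights, random.Random(0))
```

Under this sketch a module with a higher performance estimate is chosen more often, while lower-performing modules still act occasionally and can still learn from the shared samples.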