...

Multi-criteria expertness based cooperative method for SARSA and eligibility trace algorithms



Abstract

Temporal difference and eligibility traces are among the most common approaches to solving reinforcement learning problems. However, except in the case of Q-learning, there are no studies on using these two approaches in a cooperative multi-agent learning setting. This paper addresses this gap by using temporal difference and eligibility traces as the core learning methods in multi-criteria expertness based cooperative learning (MCE). Experiments on a sample maze world provide an empirical study of temporal difference and eligibility trace methods in an MCE-based cooperative learning setting.
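
For readers unfamiliar with the single-agent learner named in the title, the following is a minimal sketch of textbook tabular SARSA(λ) with accumulating eligibility traces. It is not the paper's MCE cooperative method, which the abstract does not specify. The one-dimensional corridor environment, the function names (`step`, `epsilon_greedy`, `sarsa_lambda`), and all parameter values are assumptions chosen purely for illustration.

```python
import random

def epsilon_greedy(q, state, n_actions, epsilon, rng):
    """Pick a random action with probability epsilon, else a greedy one."""
    if rng.random() < epsilon:
        return rng.randrange(n_actions)
    values = q[state]
    best = max(values)
    # Break ties between equally good actions at random.
    return rng.choice([a for a, v in enumerate(values) if v == best])

def step(state, action, n_states):
    """Hypothetical corridor: action 0 moves left, action 1 moves right.
    Reaching the last cell ends the episode with reward 1."""
    nxt = max(0, state - 1) if action == 0 else state + 1
    done = nxt == n_states - 1
    return nxt, (1.0 if done else 0.0), done

def sarsa_lambda(n_states=6, episodes=200, alpha=0.1, gamma=0.9,
                 lam=0.8, epsilon=0.1, seed=0, max_steps=1000):
    """Tabular SARSA(lambda) with accumulating traces (illustrative values)."""
    n_actions = 2
    rng = random.Random(seed)
    q = [[0.0] * n_actions for _ in range(n_states)]
    for _ in range(episodes):
        # Eligibility traces are reset at the start of every episode.
        e = [[0.0] * n_actions for _ in range(n_states)]
        s = 0
        a = epsilon_greedy(q, s, n_actions, epsilon, rng)
        for _ in range(max_steps):
            s2, r, done = step(s, a, n_states)
            a2 = epsilon_greedy(q, s2, n_actions, epsilon, rng)
            # On-policy TD error: bootstrap from the action actually chosen next.
            target = r if done else r + gamma * q[s2][a2]
            delta = target - q[s][a]
            e[s][a] += 1.0  # accumulating trace for the visited pair
            for i in range(n_states):
                for j in range(n_actions):
                    q[i][j] += alpha * delta * e[i][j]
                    e[i][j] *= gamma * lam  # decay every trace
            if done:
                break
            s, a = s2, a2
    return q
```

Calling `sarsa_lambda()` returns the learned action-value table as a list of rows, one per state. Setting `lam=0.0` reduces the update to one-step SARSA, which is the plain temporal difference case the abstract contrasts with eligibility traces.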
