Conference: European Workshop on Reinforcement Learning

Evaluation of Batch-Mode Reinforcement Learning Methods for Solving DEC-MDPs with Changing Action Sets

Abstract

DEC-MDPs with changing action sets and partially ordered transition dependencies have recently been suggested as a sub-class of general DEC-MDPs that features provably lower complexity. In this paper, we investigate the usability of a coordinated batch-mode reinforcement learning algorithm for this class of distributed problems. Our agents acquire their local policies independently of the other agents through repeated interaction with the DEC-MDP and concurrent evolution of their policies. The learning approach builds upon a specialized variant of a neural fitted Q iteration algorithm, enhanced for use in multi-agent settings. We applied our learning approach to various scheduling benchmark problems and obtained encouraging results, which show that problems of current standards of difficulty can be solved approximately very well and, in some cases, optimally.
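To make the batch-mode idea concrete, the following is a minimal single-agent sketch of fitted Q iteration in which the maximisation over successor actions is restricted to the actions available in the successor state. It is not the authors' algorithm: a tabular averaging regressor stands in for the neural network, the multi-agent coordination is omitted, and the function names, the tuple layout, and the toy two-job batch are all assumptions made for illustration.

```python
from collections import defaultdict


def fitted_q_iteration(batch, gamma=1.0, n_iterations=50):
    """Batch-mode fitted Q iteration over a fixed set of transitions.

    batch: list of (state, action, reward, next_state, next_actions),
    where next_actions is the set of actions available in next_state
    (empty for a terminal state). All names here are hypothetical.
    """
    q = defaultdict(float)  # Q estimates, 0.0 for unseen pairs
    for _ in range(n_iterations):
        targets = defaultdict(list)
        for s, a, r, s2, next_actions in batch:
            # Maximise only over actions actually available in s2.
            future = max((q[(s2, a2)] for a2 in next_actions), default=0.0)
            targets[(s, a)].append(r + gamma * future)
        # "Fit" step: the tabular stand-in for a neural regressor simply
        # averages the targets collected for each (state, action) pair.
        q = defaultdict(float, {k: sum(v) / len(v) for k, v in targets.items()})
    return q


def greedy_action(q, state, available_actions):
    """Pick the best action among those currently available."""
    return max(available_actions, key=lambda a: q[(state, a)])


# Toy batch (assumed): two jobs "a" and "b"; executing a job removes it
# from the action set. Rewards are negative processing costs.
batch = [
    ("start", "a", -1.0, "after_a", {"b"}),
    ("start", "b", -3.0, "after_b", {"a"}),
    ("after_a", "b", -1.0, "done", set()),
    ("after_b", "a", -1.0, "done", set()),
]

q = fitted_q_iteration(batch)
best_first_job = greedy_action(q, "start", {"a", "b"})
```

In this toy batch the action set shrinks as jobs are executed, which is the kind of state-dependent action availability the abstract refers to. Swapping the averaging step for a trained function approximator would move the sketch closer to a neural fitted Q iteration, at the cost of the exactness the tabular version has here.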
