Australasian Joint Conference on Artificial Intelligence

Designing Curriculum for Deep Reinforcement Learning in StarCraft II



Abstract

Reinforcement learning (RL) has proven successful in games, but suffers from long training times compared to other forms of machine learning. Curriculum learning, an optimisation technique that improves a model's ability to learn by presenting training samples in a meaningful order, known as a curriculum, could offer a solution. Curricula are usually designed manually, due to the limitations involved in automating curriculum generation. However, as there is little research into the effective design of curricula, researchers often rely on intuition, and the resulting performance can vary. In this paper, we explore different ways of manually designing curricula for RL in the real-time strategy game StarCraft II. We propose four generalised methods of manually creating curricula and verify their effectiveness through experiments. Our results show that all four of our proposed methods can improve an RL agent's learning process when used correctly. We demonstrate that using subtasks, or modifying the state space of the tasks, is the most effective way to create training samples for StarCraft II. We found that utilising subtasks during training consistently accelerated the learning process of the agent and improved the agent's final performance.
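To make the idea concrete, below is a minimal sketch of curriculum training over a sequence of StarCraft II mini-game subtasks of increasing difficulty. It is not the paper's implementation: the promotion thresholds and the `make_env`/`agent` interfaces are illustrative assumptions, while the mini-game names are real SC2LE subtasks. The agent is promoted to the next task once its rolling mean episode return clears the current task's threshold.

```python
from collections import deque

# Hypothetical curriculum: StarCraft II mini-game subtasks ordered from
# easy to hard, each paired with an assumed mean-return threshold that
# must be reached before the agent is promoted to the next task.
CURRICULUM = [
    ("MoveToBeacon",  25.0),   # basic unit control
    ("DefeatRoaches", 60.0),   # local combat micro
    ("BuildMarines",  80.0),   # economy and production
]

def train_with_curriculum(agent, make_env, episodes_per_task=1000, window=100):
    """Train `agent` on each subtask in order, promoting early once the
    rolling mean episode return clears the task's threshold.

    `make_env(name)` and the `agent.act`/`agent.observe` interface are
    placeholders, not APIs from the paper.
    """
    for task_name, threshold in CURRICULUM:
        env = make_env(task_name)
        recent = deque(maxlen=window)
        for _ in range(episodes_per_task):
            obs, done, episode_return = env.reset(), False, 0.0
            while not done:
                action = agent.act(obs)
                obs, reward, done = env.step(action)
                agent.observe(obs, reward, done)   # RL update hook
                episode_return += reward
            recent.append(episode_return)
            # Promote once performance on this subtask is good enough,
            # instead of always exhausting the episode budget.
            if len(recent) == window and sum(recent) / window >= threshold:
                break
    return agent
```

Ordering the subtasks this way is what distinguishes a curriculum from simply training on the final task: the agent carries the skills learned on easier tasks into the harder ones.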
