
Investigation into the effect of social learning in reinforcement learning board game playing agents



Abstract

This thesis presents the use of social learning to improve the performance of game-playing reinforcement learning agents. Agents are placed in a social learning environment, as opposed to the Self-Play learning environment. Their performance is monitored and analysed in order to observe how it changes compared to Self-Play agents. Two case studies were conducted, one with the game Tic-Tac-Toe and the other with the African board game Morabaraba. The Tic-Tac-Toe agents used a table-based TD(λ) algorithm to learn the Q values. The results from the tests for the Tic-Tac-Toe agents indicate that the social learning agents perform better than the Self-Play agents in both board tests and competitive tests. Increasing the population size of the agents increases the number of superior social agents and improves their skill level. In the second case study the agents use function approximation and the TD(λ) algorithm because of the larger number of states. The social agents performed better than the Self-Play agents in the board tests but were not superior in the test where they compete against each other. Larger populations were not possible with the Morabaraba agents, but the results are still positive as the agents perform well in the board tests.
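
As a rough illustration of the table-based TD(λ) idea mentioned in the abstract, the sketch below updates a lookup table from one finished game using accumulating eligibility traces. It is not the thesis's implementation: the function name `td_lambda_episode`, the parameter defaults, the choice to store values per board position rather than per state-action pair, and the assumption that the only non-zero reward arrives at the end of the game are all hypothetical simplifications.

```python
from collections import defaultdict


def td_lambda_episode(values, states, reward, alpha=0.1, gamma=1.0, lam=0.7):
    """Update a value table in place from one finished game (illustrative only).

    values: defaultdict(float) mapping a hashable board position to its value.
    states: positions the agent visited, in order; the game ends after the last one.
    reward: terminal outcome, e.g. 1 for a win, 0 for a draw, -1 for a loss.
            Intermediate rewards are assumed to be zero.
    """
    traces = defaultdict(float)  # eligibility trace per visited position
    for t, state in enumerate(states):
        last = t == len(states) - 1
        next_value = 0.0 if last else values[states[t + 1]]
        step_reward = reward if last else 0.0
        # One-step temporal-difference error for this transition.
        delta = step_reward + gamma * next_value - values[state]
        traces[state] += 1.0  # accumulating trace
        for s in list(traces):
            # Every recently visited position shares in the error,
            # weighted by how long ago it was visited.
            values[s] += alpha * delta * traces[s]
            traces[s] *= gamma * lam
    return values


if __name__ == "__main__":
    table = defaultdict(float)
    td_lambda_episode(table, ["a", "b", "c"], reward=1.0, alpha=0.5, lam=0.5)
    print(dict(table))
```

With λ = 0 only the final position would be credited for the win; larger λ spreads the credit further back along the game, which is the usual motivation for using traces in games with delayed outcomes.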

Bibliographic record

  • Author: Marivate Vukosi Ntsakisi
  • Year: 2009
  • Original format: PDF
  • Language: en
