Journal: IEEE Transactions on Computational Intelligence and AI in Games

On Scalability, Generalization, and Hybridization of Coevolutionary Learning: A Case Study for Othello

Abstract

This study investigates different methods of learning to play the game of Othello. The main questions concern the scalability of the algorithms with respect to search space size, and their capability to generalize and produce players that fare well against various opponents. The considered algorithms represent strategies as $n$-tuple networks and employ self-play temporal difference learning (TDL), evolutionary learning (EL), coevolutionary learning (CEL), and hybrids thereof. To assess performance, three different measures are used: score against an a priori given opponent (a fixed heuristic strategy), against opponents trained by other methods (round-robin tournament), and against the top-ranked players from the online Othello League. We demonstrate that although evolutionary-based methods yield players that fare best against a fixed heuristic player, it is coevolutionary temporal difference learning (CTDL), a hybrid of coevolution and TDL, that generalizes better and proves superior when confronted with a pool of previously unseen opponents. Moreover, CTDL scales well with the size of the representation, attaining better results for larger $n$-tuple networks. By showing that a strategy learned in this way wins against the top entries from the Othello League, we conclude that it is one of the best 1-ply Othello players obtained to date without explicit use of human knowledge.
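
The abstract names $n$-tuple networks and temporal difference learning without giving details, so the following is only a minimal sketch of how such a value function is commonly built, not the authors' implementation. Everything specific here is an assumption made for illustration: the three-valued cell encoding, the tanh squashing of the summed weights, the learning rate, the chosen tuple positions, and the names `NTupleNetwork`, `value`, and `td_update`.

```python
import math

# Assumed cell encoding for an 8x8 board stored as a flat list of 64 ints.
EMPTY, BLACK, WHITE = 0, 1, 2


class NTupleNetwork:
    """Hypothetical n-tuple value function: one lookup table per tuple."""

    def __init__(self, tuples):
        # Each tuple is a sequence of board indices (0..63).
        self.tuples = [tuple(t) for t in tuples]
        # A tuple of length n sees 3**n possible cell patterns.
        self.luts = [[0.0] * (3 ** len(t)) for t in self.tuples]

    def _index(self, board, positions):
        # Read the cells at `positions` as a base-3 number.
        idx = 0
        for pos in positions:
            idx = idx * 3 + board[pos]
        return idx

    def value(self, board):
        # Sum the active weight of every tuple, squashed to (-1, 1).
        total = sum(
            lut[self._index(board, t)]
            for t, lut in zip(self.tuples, self.luts)
        )
        return math.tanh(total)

    def td_update(self, board, target, alpha=0.01):
        # Gradient step moving value(board) toward `target`, where
        # `target` is the value of the successor state or the final
        # game outcome (assumed TD(0)-style update).
        v = self.value(board)
        delta = alpha * (target - v) * (1.0 - v * v)
        for t, lut in zip(self.tuples, self.luts):
            lut[self._index(board, t)] += delta
        return v


# Usage: two illustrative 3-tuples, trained toward a winning outcome.
net = NTupleNetwork([(0, 1, 2), (0, 8, 16)])
board = [EMPTY] * 64
board[0] = BLACK
for _ in range(200):
    net.td_update(board, target=1.0, alpha=0.1)
```

Because each tuple contributes only a table lookup, enlarging the representation (more or longer tuples) increases the number of weights without changing the evaluation procedure, which is the sense in which the abstract speaks of scaling with the size of the representation.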