The Journal of Artificial Intelligence Research

Learning to Play Using Low-Complexity Rule-Based Policies: Illustrations through Ms. Pac-Man

Abstract

In this article we propose a method that can deal with certain combinatorial reinforcement learning tasks. We demonstrate the approach in the popular Ms. Pac-Man game. We define a set of high-level observation and action modules, from which rule-based policies are constructed automatically. In these policies, actions are temporally extended, and may work concurrently. The policy of the agent is encoded by a compact decision list. The components of the list are selected from a large pool of rules, which can be either hand-crafted or generated automatically. A suitable selection of rules is learnt by the cross-entropy method, a recent global optimization algorithm that fits our framework smoothly. Cross-entropy-optimized policies perform better than our hand-crafted policy, and reach the score of average human players. We argue that learning is successful mainly because (i) policies may apply concurrent actions and thus the policy space is sufficiently rich, (ii) the search is biased towards low-complexity policies and therefore, solutions with a compact description can be found quickly if they exist.
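The abstract's core idea can be sketched in code: a policy is a list of rules chosen from a large pool, and the cross-entropy (CE) method learns which rules to include by iteratively sampling rule subsets and shifting per-rule inclusion probabilities toward the best-scoring samples. The sketch below is a toy illustration under stated assumptions, not the paper's Ms. Pac-Man setup: the rule pool is abstract indices, and the fitness function is a stand-in that rewards a hypothetical "good" subset of rules while penalizing long decision lists (the paper's bias toward low-complexity policies).

```python
import random

random.seed(0)

POOL_SIZE = 20                 # number of candidate rules in the pool (assumed)
TARGET = {1, 4, 7, 11}         # hypothetical "good" rule subset (assumed)

def fitness(mask):
    """Toy score: +2 per good rule included, -1 per rule in the list,
    so shorter lists that hit the good rules score highest."""
    chosen = {i for i, bit in enumerate(mask) if bit}
    return 2 * len(chosen & TARGET) - len(chosen)

def cross_entropy_select(iterations=50, population=100,
                         elite_frac=0.1, alpha=0.7):
    """CE method over independent Bernoulli inclusion probabilities,
    one per rule in the pool."""
    probs = [0.5] * POOL_SIZE          # start: each rule equally likely
    n_elite = max(1, int(elite_frac * population))
    for _ in range(iterations):
        # Sample a population of candidate rule subsets (bit masks).
        samples = [[int(random.random() < p) for p in probs]
                   for _ in range(population)]
        samples.sort(key=fitness, reverse=True)
        elite = samples[:n_elite]
        # Move each rule's probability toward its frequency in the elite set.
        for i in range(POOL_SIZE):
            freq = sum(s[i] for s in elite) / n_elite
            probs[i] = alpha * freq + (1 - alpha) * probs[i]
    # Final policy: include the rules the distribution has converged on.
    return [int(p > 0.5) for p in probs]

best = cross_entropy_select()
print(sorted(i for i, bit in enumerate(best) if bit))
```

On this separable toy objective the distribution typically concentrates on the target rules within a few dozen iterations; the paper's actual search runs over hand-crafted and automatically generated game rules with game score as fitness.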
