International Florida Artificial Intelligence Research Society Conference

Exploiting Key Events for Learning Interception Policies

Abstract

One scenario that commonly arises in computer games and military training simulations is predator-prey pursuit, in which the goal of the non-player character agent is to successfully intercept a fleeing player. In this paper, we focus on a variant of the problem in which the agent does not have perfect information about the player's location but has prior experience in combating the player. Effectively addressing this problem requires a combination of learning the opponent's tactics while planning an interception strategy. Although for small maps solving the problem with standard POMDP (Partially Observable Markov Decision Process) solvers is feasible, increasing the search area renders many standard techniques intractable due to the increase in belief state size and required plan length. Here we introduce a new approach for solving the problem on large maps that exploits key events, high reward regions in the belief state discovered at the higher level of abstraction, to plan efficiently over the low-level map. We demonstrate that our hierarchical key-events planner can learn intercept policies from traces of previous pursuits significantly faster than a standard point-based POMDP solver, particularly as the maps scale in size.
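The abstract does not spell out the planner's internals, but the core idea it describes — maintain a belief over the fleeing player's location, extract high-reward "key event" regions from that belief, and hand them to a low-level planner as interception subgoals — can be sketched. The toy Python below is an illustration under assumed simplifications, not the paper's algorithm: a small toroidal grid map, a random-walk prey motion model standing in for tactics learned from pursuit traces, and a greedy low-level step. The names diffuse, pick_key_event, and step_toward are hypothetical.

```python
import numpy as np

GRID = 16  # toy map size; the paper targets much larger maps

def diffuse(belief):
    """Predict step: 4-neighbour random-walk motion model for the prey.
    A crude stand-in for the tactics the paper learns from traces of
    previous pursuits. np.roll wraps at the edges (toroidal toy map)."""
    b = 0.2 * belief
    for shift, axis in [(1, 0), (-1, 0), (1, 1), (-1, 1)]:
        b += 0.2 * np.roll(belief, shift, axis=axis)
    return b / b.sum()

def pick_key_event(belief, mass_threshold=0.9):
    """Approximate a 'key event' as the centroid of the smallest set of
    cells holding most of the belief mass; it becomes the subgoal."""
    order = np.argsort(belief, axis=None)[::-1]
    mass, cells = 0.0, []
    for idx in order:
        cells.append(np.unravel_index(idx, belief.shape))
        mass += belief.flat[idx]
        if mass >= mass_threshold:
            break
    ys, xs = zip(*cells)
    return int(round(sum(ys) / len(ys))), int(round(sum(xs) / len(xs)))

def step_toward(agent, goal):
    """Low-level planner: one greedy grid step toward the subgoal."""
    ay, ax = agent
    gy, gx = goal
    return int(ay + np.sign(gy - ay)), int(ax + np.sign(gx - ax))

# One pursuit episode: the prey is hidden, but its start region is known.
belief = np.zeros((GRID, GRID))
belief[GRID - 1, GRID - 1] = 1.0        # prior: prey last seen in a corner
agent = (0, 0)
for t in range(40):
    belief = diffuse(belief)            # predict prey motion
    goal = pick_key_event(belief)       # high-level subgoal from the belief
    agent = step_toward(agent, goal)    # low-level move on the map
    belief[agent] = 0.0                 # null observation at the agent's cell
    belief /= belief.sum()
print("agent ended at", agent, "pursuing subgoal", goal)
```

The hierarchical split is the point of the sketch: the belief-space reasoning (where is the prey likely to be?) happens once per step at the abstract level, while path execution stays on the concrete map, which is what lets the approach scale where flat point-based POMDP solvers do not.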
