Journal: International Journal of Pattern Recognition and Artificial Intelligence

INTERLEAVED VERSUS A PRIORI EXPLORATION FOR REPEATED NAVIGATION IN A PARTIALLY-KNOWN GRAPH


Abstract

In this paper, we address the tradeoff between exploration and exploitation for agents that need to learn more about the structure of their environment in order to perform more effectively. For example, a software agent operating on the World Wide Web may need to learn which sites on the net are most useful, and the most efficient routes to those sites. We compare exploration strategies for a repeated task, where the agent is given a particular task to perform some number of times. Tasks are modeled as navigation on a partially known (deterministic) graph. This paper describes a new utility-based exploration algorithm for repeated tasks which interleaves exploration with task performance. The method takes into account both the costs and the potential benefits (for future task repetitions) of different exploratory actions. Exploration is performed in a greedy fashion, with the locally optimal exploratory action performed during each repetition of the task. We experimentally evaluated our utility-based interleaved exploration algorithm against a heuristic search algorithm for exploration before task performance (a priori exploration) as well as a randomized interleaved exploration algorithm. We found that for a single repeated task, utility-based interleaved exploration consistently outperforms the alternatives, unless the number of task repetitions is very high. In addition, we extended the algorithms to the case of multiple repeated tasks, where the agent has a different, randomly chosen task (from a known subset of possible tasks) to perform each time. Here too, we found t
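The abstract does not give the utility formula, so the following is only a minimal sketch of the general idea it describes: weigh the cost of an exploratory action against the saving it might bring over the remaining task repetitions, and greedily pick the best one. Everything here is an assumption made for illustration, not the paper's algorithm: the names `shortest_cost`, `exploration_utility`, and `pick_exploration`, the candidate tuple `(u, v, guessed_cost, probe_cost)`, and the utility rule `saving * remaining_runs - probe_cost`.

```python
import heapq

INF = float("inf")


def shortest_cost(graph, start, goal):
    """Dijkstra over the known edges; graph maps node -> {neighbor: cost}."""
    dist = {start: 0.0}
    frontier = [(0.0, start)]
    while frontier:
        d, node = heapq.heappop(frontier)
        if node == goal:
            return d
        if d > dist.get(node, INF):
            continue  # stale queue entry
        for nxt, cost in graph.get(node, {}).items():
            nd = d + cost
            if nd < dist.get(nxt, INF):
                dist[nxt] = nd
                heapq.heappush(frontier, (nd, nxt))
    return INF


def exploration_utility(graph, start, goal, candidate, remaining_runs):
    """Hypothetical utility: guessed saving over future runs minus probing cost."""
    u, v, guessed_cost, probe_cost = candidate
    # Copy the known graph and optimistically add the unverified edge.
    trial = {n: dict(nbrs) for n, nbrs in graph.items()}
    edges = trial.setdefault(u, {})
    edges[v] = min(guessed_cost, edges.get(v, INF))
    new_cost = shortest_cost(trial, start, goal)
    if new_cost == INF:
        return -probe_cost  # the edge would not even make the goal reachable
    saving = shortest_cost(graph, start, goal) - new_cost
    return saving * remaining_runs - probe_cost


def pick_exploration(graph, start, goal, candidates, remaining_runs):
    """Greedy choice: best candidate with positive utility, else None."""
    best, best_utility = None, 0.0
    for cand in candidates:
        utility = exploration_utility(graph, start, goal, cand, remaining_runs)
        if utility > best_utility:
            best, best_utility = cand, utility
    return best
```

With a known graph `{"A": {"B": 4.0}, "B": {"C": 4.0}, "C": {}}` and candidates `("A", "C", 5.0, 6.0)` and `("B", "C", 3.0, 0.5)`, this sketch prefers probing the A to C shortcut when five repetitions remain, but falls back to the cheaper B to C probe when only one remains, because an expensive probe cannot pay for itself over a single run.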
