Source: JMLR: Workshop and Conference Proceedings

Static and Dynamic Values of Computation in MCTS


Abstract

Monte-Carlo Tree Search (MCTS) is one of the most widely used methods for planning, and has powered many recent advances in artificial intelligence. In MCTS, one typically performs computations (i.e., simulations) to collect statistics about the possible future consequences of actions, and then chooses accordingly. Many popular MCTS methods such as UCT and its variants decide which computations to perform by trading off exploration and exploitation. In this work, we take a more direct approach, and explicitly quantify the value of a computation based on its expected impact on the quality of the action eventually chosen. Our approach goes beyond the myopic limitations of existing computation-value-based methods in two senses: (I) we are able to account for the impact of non-immediate (i.e., future) computations (II) on non-immediate actions. We show that policies that greedily optimize computation values are optimal under certain assumptions and obtain results that are competitive with the state-of-the-art.
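To make the idea of a computation's value concrete, the following is a minimal sketch of the classical myopic (one-step) calculation, in the style of the knowledge-gradient rule. It is not the method proposed in this paper, which is described as going beyond exactly this myopic setting. The sketch assumes independent Gaussian beliefs over action values and Gaussian simulation noise; the function name `myopic_voc` and its parameters are hypothetical and chosen only for illustration.

```python
import math


def _pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def _cdf(z):
    """Standard normal cumulative distribution."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def myopic_voc(means, stds, noise_std):
    """Expected gain in the value of the best action from ONE more
    simulation of each action (illustrative assumption: independent
    Gaussian beliefs, Gaussian observation noise, at least two actions).
    """
    vocs = []
    for i, (mu, sd) in enumerate(zip(means, stds)):
        # Best posterior mean among the competing actions.
        best_other = max(m for j, m in enumerate(means) if j != i)
        # Std. dev. of the change in the posterior mean after one sample.
        denom = math.sqrt(sd * sd + noise_std * noise_std)
        s = (sd * sd / denom) if denom > 0 else 0.0
        if s == 0.0:
            vocs.append(0.0)  # nothing left to learn about this action
            continue
        z = -abs(mu - best_other) / s
        vocs.append(s * (z * _cdf(z) + _pdf(z)))
    return vocs


def greedy_choice(means, stds, noise_std):
    """Index of the action whose next simulation has the highest value."""
    vocs = myopic_voc(means, stds, noise_std)
    return max(range(len(vocs)), key=vocs.__getitem__)
```

Under these assumptions, a greedy policy would simulate the action returned by `greedy_choice`. An action that is close to the current best and still uncertain tends to score highest, while an action whose value is already known scores zero.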
