European Conference on Artificial Intelligence

A Reinforcement-Learning Algorithm for Sampling Design in Markov Random Fields



Abstract

Optimal sampling in spatial random fields is a complex problem, which mobilizes several research fields in spatial statistics and artificial intelligence. In this paper we consider the case where observations are discrete-valued and modelled by a Markov Random Field. We then encode the sampling problem in the Markov Decision Process (MDP) framework. After exploring existing heuristic solutions as well as classical algorithms from the field of Reinforcement Learning (RL), we design an original algorithm, LSDP (Least Square Dynamic Programming), which uses simulated trajectories to approximately solve any finite-horizon MDP problem. Based on an empirical study of the behaviour of these different approaches on binary models, we derive the following conclusions: i) a naive heuristic, consisting of sampling the sites where marginals are most uncertain, is already an efficient sampling approach; ii) LSDP outperforms all the classical RL approaches we have tested; iii) LSDP outperforms the heuristic in cases where reconstruction errors have a high cost or sampling actions are constrained. In addition, LSDP readily handles action costs in the optimisation problem, as well as cases where some sites of the MRF cannot be observed.
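To make the naive heuristic of conclusion i) concrete, the following is a minimal Python sketch (not taken from the paper): for a binary MRF on a grid, it estimates per-site marginals by Gibbs sampling and then selects the unobserved site whose marginal is closest to 0.5, i.e. the most uncertain one. The grid size, the coupling strength `beta`, and the number of Gibbs sweeps are arbitrary choices made only for illustration.

```python
import numpy as np

def gibbs_marginals(n, beta, observed, n_sweeps=200, rng=None):
    """Estimate P(x_s = 1) for every site of an n x n binary MRF with
    pairwise "agreement" potentials of strength beta, clamping the sites
    listed in `observed` ({(i, j): value})."""
    rng = np.random.default_rng(rng)
    x = rng.integers(0, 2, size=(n, n))
    for (i, j), v in observed.items():
        x[i, j] = v
    counts = np.zeros((n, n))
    for _ in range(n_sweeps):
        for i in range(n):
            for j in range(n):
                if (i, j) in observed:
                    continue
                nbrs = [(a, b) for a, b in ((i - 1, j), (i + 1, j), (i, j - 1), (i, j + 1))
                        if 0 <= a < n and 0 <= b < n]
                s = sum(x[a, b] for a, b in nbrs)  # neighbours currently equal to 1
                # conditional of x_ij given its neighbours for the agreement potential
                e1 = beta * s              # unnormalised log-probability of x_ij = 1
                e0 = beta * (len(nbrs) - s)  # unnormalised log-probability of x_ij = 0
                p1 = np.exp(e1) / (np.exp(e0) + np.exp(e1))
                x[i, j] = rng.random() < p1
        counts += x
    return counts / n_sweeps

def next_site_to_sample(marginals, observed):
    """Heuristic choice: the unobserved site whose marginal is closest to 0.5."""
    best, best_gap = None, np.inf
    n = marginals.shape[0]
    for i in range(n):
        for j in range(n):
            if (i, j) in observed:
                continue
            gap = abs(marginals[i, j] - 0.5)
            if gap < best_gap:
                best, best_gap = (i, j), gap
    return best

if __name__ == "__main__":
    observed = {(0, 0): 1, (4, 4): 0}  # sites already sampled
    marg = gibbs_marginals(n=5, beta=0.8, observed=observed, rng=0)
    print("next site to sample:", next_site_to_sample(marg, observed))
```

In this sketch the heuristic is myopic: it ignores action costs, observation constraints and the remaining horizon, which is exactly where, according to the abstract, a sequential MDP formulation solved with LSDP is expected to pay off.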
