20th European Conference on Artificial Intelligence

A Reinforcement-Learning Algorithm for Sampling Design in Markov Random Fields



Abstract

Optimal sampling in spatial random fields is a complex problem, which draws on several research areas in spatial statistics and artificial intelligence. In this paper we consider the case where observations are discrete-valued and modelled by a Markov Random Field (MRF). We then encode the sampling problem into the Markov Decision Process (MDP) framework. After exploring existing heuristic solutions as well as classical algorithms from the field of Reinforcement Learning (RL), we design an original algorithm, LSDP (Least Square Dynamic Programming), which uses simulated trajectories to approximately solve any finite-horizon MDP problem. Based on an empirical study of the behaviour of these different approaches on binary models, we draw the following conclusions: i) a naive heuristic, which samples the sites whose marginals are the most uncertain, is already an efficient sampling approach; ii) LSDP outperforms all the classical RL approaches we have tested; iii) LSDP outperforms the heuristic when reconstruction errors have a high cost or sampling actions are constrained. In addition, LSDP readily handles action costs in the optimisation problem, as well as cases where some sites of the MRF cannot be observed.
机译:空间随机场中的最优采样是一个复杂的问题,它动员了空间统计和人工智能领域的几个研究领域。在本文中,我们考虑了观测值是离散值并由马尔可夫随机场建模的情况。然后,我们将采样问题编码到马尔可夫决策过程(MDP)框架中。在探究了现有的启发式解决方案以及强化学习(RL)领域的经典算法之后,我们设计了一种原始算法LSDP(最小二乘动态规划),该算法使用模拟轨迹来解决大约任何有限水平MDP问题。基于对这些不同方法在二元模型上的行为的经验研究,我们得出以下结论:i)朴素的启发式方法(由边际最不确定的采样点组成)已经是一种有效的采样方法; ii)LSDP优于我们测试过的所有经典RL方法; iii)在重建误差代价高昂或采样动作受到限制的情况下,LSDP优于启发式算法。此外,LSDP可以轻松处理优化问题中的动作成本,以及无法观察到MRF某些位置的情况。


