Decision-making in uncertain environments is a fundamental problem in artificial intelligence, and Markov decision processes (MDPs) have become a popular model for non-deterministic planning problems with full observability. An MDP assumes discrete states and discrete actions, and can be viewed as a stochastic automaton in which an agent's actions have uncertain effects. These uncertain action outcomes induce stochastic transitions between states, and the expected value of a chosen action is a function of the transitions it induces. On executing an action, the agent receives a reward and causes a change in the state of the environment. The agent's objective is to choose actions so as to maximize the cumulative future reward over time. In practice, Value Iteration (VI) is probably the best-known and most widely used method for solving MDPs.
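As a concrete illustration of VI, the following is a minimal sketch in Python, not an implementation from this paper. The tabular transition model `P`, reward array `R`, discount factor `gamma`, and convergence threshold `theta` are illustrative assumptions; the core step is the standard Bellman optimality backup, which repeatedly updates the value of each state with the best expected one-step return.

```python
import numpy as np

def value_iteration(P, R, gamma=0.95, theta=1e-6):
    """Minimal value iteration sketch (illustrative assumptions, not from the text).

    P:     transition probabilities, shape (S, A, S'); P[s, a, s2] = Pr(s2 | s, a)
    R:     expected immediate rewards, shape (S, A)
    gamma: discount factor in [0, 1)
    theta: convergence threshold on the Bellman residual
    """
    n_states, n_actions, _ = P.shape
    V = np.zeros(n_states)
    while True:
        # Bellman optimality backup:
        #   Q(s, a) = R(s, a) + gamma * sum_{s'} P(s' | s, a) * V(s')
        Q = R + gamma * (P @ V)          # shape (S, A)
        V_new = Q.max(axis=1)            # greedy over actions
        if np.max(np.abs(V_new - V)) < theta:
            break
        V = V_new
    policy = Q.argmax(axis=1)            # greedy policy w.r.t. converged values
    return V_new, policy
```

With a discount factor below one, the backup is a contraction, so the loop is guaranteed to drive the Bellman residual below `theta` and the greedy policy extracted at convergence is (near-)optimal for the given model.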