Robust, risk-sensitive, and data-driven control of Markov Decision Processes


Abstract

Markov Decision Processes (MDPs) model problems of sequential decision-making under uncertainty. They have been studied and applied extensively. Nonetheless, two major barriers still hinder the applicability of MDPs to many more practical decision-making problems:

  • The decision maker often lacks a reliable MDP model. Since the results obtained by dynamic programming are sensitive to the assumed MDP model, their relevance is challenged by model uncertainty.
  • The structural and computational results of dynamic programming (which deals with expected performance) have been extended with only limited success to accommodate risk-sensitive decision makers.

In this thesis, we investigate two ways of dealing with uncertain MDPs and we develop a new connection between robust control of uncertain MDPs and risk-sensitive control of dynamical systems. The first approach assumes a model of model uncertainty and formulates the control of uncertain MDPs as a problem of decision-making under (model) uncertainty. We establish that most formulations are at least NP-hard and thus suffer from the "curse of uncertainty." The worst-case control of MDPs with rectangular uncertainty sets is equivalent to a zero-sum game between the controller and nature.
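
To make the zero-sum-game reading of worst-case control concrete, here is a minimal sketch of robust value iteration under rectangular uncertainty. It is not taken from the thesis. It assumes a discounted, finite MDP in which each state-action pair has its own finite list of candidate transition distributions, so nature can choose the worst one independently per pair. The function name `robust_value_iteration`, the data layout, and the two-state toy model are all hypothetical, chosen only for illustration.

```python
from typing import List

def robust_value_iteration(
    rewards: List[List[float]],
    candidates: List[List[List[List[float]]]],
    gamma: float = 0.5,
    tol: float = 1e-8,
    max_iter: int = 10_000,
) -> List[float]:
    """Worst-case (max-min) value iteration for a finite discounted MDP.

    rewards[s][a]    : immediate reward for action a in state s.
    candidates[s][a] : finite list of candidate next-state distributions
                       for (s, a); this finite set is an assumption that
                       stands in for a rectangular uncertainty set.
    """
    n_states = len(rewards)
    values = [0.0] * n_states
    for _ in range(max_iter):
        new_values = []
        for s in range(n_states):
            best = float("-inf")
            for a in range(len(rewards[s])):
                # Nature (the minimizing player) picks the worst candidate
                # distribution for this (s, a) pair only.
                worst = min(
                    sum(p * v for p, v in zip(dist, values))
                    for dist in candidates[s][a]
                )
                # The controller (the maximizing player) picks the best action.
                best = max(best, rewards[s][a] + gamma * worst)
            new_values.append(best)
        delta = max(abs(x - y) for x, y in zip(new_values, values))
        values = new_values
        if delta < tol:
            break
    return values

# Hypothetical toy model: state 0 has a "safe" action (reward 1.0, stays in
# state 0) and a "risky" action (reward 1.5, may stay in state 0 or fall
# into state 1). State 1 is absorbing with reward 0.
toy_rewards = [[1.0, 1.5], [0.0]]
toy_candidates = [
    [[[1.0, 0.0]], [[1.0, 0.0], [0.0, 1.0]]],
    [[[0.0, 1.0]]],
]
# Nominal model: trust only the optimistic candidate for the risky action.
nominal_candidates = [
    [[[1.0, 0.0]], [[1.0, 0.0]]],
    [[[0.0, 1.0]]],
]
robust_values = robust_value_iteration(toy_rewards, toy_candidates)
nominal_values = robust_value_iteration(toy_rewards, nominal_candidates)
```

In this toy model the nominal value of state 0 is driven by the risky action, while the robust value falls back to the safe action because nature can always send the risky action to the zero-reward absorbing state. Non-rectangular uncertainty sets, where nature's choices are coupled across state-action pairs, do not decompose this way, which is consistent with the hardness results the abstract mentions.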

Bibliographic details

  • Author

    Le Tallec Yann

  • Affiliation
  • Year 2007
  • Total pages
  • Original format PDF
  • Language eng
  • Chinese Library Classification
