Journal: Kybernetika

RISK-SENSITIVE AVERAGE OPTIMALITY IN MARKOV DECISION PROCESSES

Abstract

In this note attention is focused on finding policies that optimize risk-sensitive optimality criteria in Markov decision chains. To this end we assume that the total reward generated by the Markov process is evaluated by an exponential utility function with a given risk-sensitive coefficient. The ratio of the first two moments depends on the value of the risk-sensitive coefficient; if the risk-sensitive coefficient is equal to zero, we speak of risk-neutral models. Observe that the first moment of the generated reward corresponds to the expectation of the total reward and the second central moment to the reward variance. For communicating Markov processes and for some specific classes of unichain processes, the long-run risk-sensitive average reward is independent of the starting state. In this note we present a necessary and sufficient condition for the existence of optimal policies independent of the starting state in unichain models and characterize the class of average risk-sensitive optimal policies.
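
To make the criterion in the abstract concrete, the following is a minimal sketch of how a long-run risk-sensitive average reward could be evaluated for one fixed stationary policy. It is not taken from the paper. It assumes a finite chain with transition probabilities P[i][j], transition rewards R[i][j], a nonzero risk-sensitive coefficient gamma, and a primitive (irreducible and aperiodic) transition matrix, so that the growth rate of E[exp(gamma * total reward)] equals the spectral radius of the matrix with entries P[i][j] * exp(gamma * R[i][j]). The function name, the reward convention, and the use of power iteration are illustrative choices.

```python
import math


def risk_sensitive_average_reward(P, R, gamma, iters=2000):
    """Estimate (1 / gamma) * ln(rho(Q)), where Q[i][j] = P[i][j] * exp(gamma * R[i][j]).

    Assumptions (not from the source): P is a primitive stochastic matrix,
    R[i][j] is the reward earned on the transition i -> j, and gamma != 0.
    """
    if gamma == 0:
        raise ValueError("gamma must be nonzero; gamma = 0 is the risk-neutral case")
    n = len(P)
    # Reward-weighted transition matrix.
    Q = [[P[i][j] * math.exp(gamma * R[i][j]) for j in range(n)] for i in range(n)]
    # Power iteration: for a primitive nonnegative matrix the normalizing
    # factor converges to the spectral radius (Perron eigenvalue).
    v = [1.0] * n
    log_rho = 0.0
    for _ in range(iters):
        w = [sum(Q[i][j] * v[j] for j in range(n)) for i in range(n)]
        scale = max(w)
        log_rho = math.log(scale)
        v = [x / scale for x in w]
    return log_rho / gamma


# Hypothetical two-state example: the next state is drawn uniformly,
# and the reward is 1 when the chain moves to state 1, otherwise 0.
P = [[0.5, 0.5], [0.5, 0.5]]
R = [[0.0, 1.0], [0.0, 1.0]]
risk_averse = risk_sensitive_average_reward(P, R, gamma=-1.0)
near_neutral = risk_sensitive_average_reward(P, R, gamma=1e-6)
risk_seeking = risk_sensitive_average_reward(P, R, gamma=1.0)
```

Under these assumptions the value does not depend on the starting state, and as gamma approaches zero it approaches the ordinary (risk-neutral) average reward, which is 0.5 in this example. A negative gamma yields a smaller value and a positive gamma a larger one, reflecting how the exponential utility weights the reward variance.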
