USING CYCLIC MARKOV DECISION PROCESS TO DETERMINE OPTIMUM POLICY

Abstract

A method for determining an optimum policy by using a Markov decision process in which T subspaces each have at least one state having a cyclic structure includes identifying, with a processor, subspaces that are part of a state space; selecting a t-th (t is a natural number, t≦T) subspace among the identified subspaces; computing a probability of, and an expected value of a cost of, reaching from one or more states in the selected t-th subspace to one or more states in the t-th subspace in a following cycle; and recursively computing a value and an expected value of a cost based on the computed probability and expected value of the cost, in a sequential manner starting from a (t−1)-th subspace.
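
The evaluation step described in the abstract can be illustrated with a minimal sketch. It is not the patented method itself, and it rests on assumptions the abstract does not state: the policy is already fixed (so only its value is computed, not the optimum policy), every transition leaves subspace t for subspace t+1 with the last subspace wrapping around to the first, and costs are discounted by a factor `gamma` per step. The function name `evaluate_cyclic_policy` and the argument layout are hypothetical.

```python
import numpy as np


def evaluate_cyclic_policy(P, c, gamma):
    """Evaluate a fixed policy on a cyclic MDP (illustrative sketch only).

    P[t]  : matrix of transition probabilities from the states of subspace t
            to the states of subspace (t + 1) % T  (assumed cyclic structure).
    c[t]  : vector of expected one-step costs for the states of subspace t.
    gamma : per-step discount factor in (0, 1)  (an assumption of this sketch).
    Returns a list v where v[t] holds the values of the states of subspace t.
    """
    T = len(P)
    t_sel = T - 1                      # the selected subspace (here: the last one)
    n = P[t_sel].shape[0]

    # Walk once around the cycle, starting and ending in the selected subspace.
    Q = np.eye(n)                      # discounted reach probabilities so far
    d = np.zeros(n)                    # expected discounted cost over one cycle
    for k in range(T):
        t = (t_sel + k) % T
        d = d + Q @ c[t]
        Q = gamma * (Q @ P[t])

    # Q now maps the selected subspace to itself one cycle later,
    # so its values satisfy v = d + Q v.
    v = [None] * T
    v[t_sel] = np.linalg.solve(np.eye(n) - Q, d)

    # Backward recursion, starting from the (t_sel - 1)-th subspace.
    for t in range(t_sel - 1, -1, -1):
        v[t] = c[t] + gamma * (P[t] @ v[t + 1])
    return v


# Two subspaces with one state each, costs 1 and 2, discount 0.5.
values = evaluate_cyclic_policy(
    [np.array([[1.0]]), np.array([[1.0]])],
    [np.array([1.0]), np.array([2.0])],
    0.5,
)
print(values)
```

Under these assumptions, only a linear system the size of one subspace has to be solved. The remaining subspaces follow by back-substitution, which is the appeal of exploiting the cyclic structure.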
