USING CYCLIC MARKOV DECISION PROCESS TO DETERMINE OPTIMUM POLICY

Abstract

A method for determining an optimum policy by using a Markov decision process in which T subspaces, each having at least one state, have a cyclic structure. The method includes: identifying, with a processor, the subspaces that are part of a state space; selecting a t-th subspace (t is a natural number, t ≤ T) among the identified subspaces; computing the probability of, and the expected value of the cost of, going from one or more states in the selected t-th subspace to one or more states in the t-th subspace in the following cycle; and recursively computing a value and an expected value of a cost, based on the computed probability and expected value of the cost, sequentially starting from the (t−1)-th subspace.
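The steps in the abstract can be illustrated with a minimal Python sketch. It is not the patented method: it only evaluates a fixed policy, and it assumes a discounted-cost criterion, neither of which the abstract specifies. The function name `evaluate_cyclic_policy`, the discount factor `gamma`, and the layout of `P` and `c` are hypothetical choices made for illustration. Subspace k is assumed to transition only into subspace (k+1) mod T.

```python
import numpy as np

def evaluate_cyclic_policy(P, c, gamma=0.9, t=0):
    """Evaluate a fixed policy on a cyclic MDP (illustrative assumptions only).

    P[k]  : matrix of transition probabilities from states in subspace k
            to states in subspace (k + 1) % T under the fixed policy.
    c[k]  : vector of expected immediate costs for states in subspace k.
    gamma : discount factor (an assumption; not stated in the abstract).
    t     : index of the selected subspace.
    Returns a list v where v[k] holds the values of the states in subspace k.
    """
    T = len(P)
    n_t = P[t].shape[0]

    # Step 1: probability (Q) and expected discounted cost (cycle_cost) of
    # going from subspace t back to subspace t in the following cycle.
    Q = np.eye(n_t)
    cycle_cost = np.zeros(n_t)
    disc = 1.0
    for step in range(T):
        k = (t + step) % T
        cycle_cost += disc * (Q @ c[k])
        Q = Q @ P[k]
        disc *= gamma  # after the loop, disc == gamma ** T

    # Step 2: the values on subspace t satisfy v_t = cycle_cost + gamma^T Q v_t.
    v = [None] * T
    v[t] = np.linalg.solve(np.eye(n_t) - disc * Q, cycle_cost)

    # Step 3: recurse backwards, starting from the (t-1)-th subspace.
    for step in range(1, T):
        k = (t - step) % T
        v[k] = c[k] + gamma * (P[k] @ v[(k + 1) % T])
    return v

# Toy example with T = 3 subspaces of sizes 2, 1 and 2 (made-up numbers).
P_example = [
    np.array([[1.0], [1.0]]),
    np.array([[0.3, 0.7]]),
    np.array([[0.5, 0.5], [0.2, 0.8]]),
]
c_example = [np.array([1.0, 2.0]), np.array([0.5]), np.array([3.0, 0.0])]
values = evaluate_cyclic_policy(P_example, c_example, gamma=0.9, t=0)
```

The point of the design is that the linear system is solved only on the selected subspace, and every other subspace is filled in by one backward pass. Finding an optimum policy would additionally require a minimization over actions, which this sketch leaves out.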

Bibliographic details

Publication number: US2013085983A1
Publication date: 2013-04-04
Original format: PDF
Applicant/patentee: TAKAYUKI OSOGAMI; RAYMOND H. RUDY
Application number: US201213586385
Inventors: RAYMOND H. RUDY; TAKAYUKI OSOGAMI
Filing date: 2012-08-15
Classification: G06N5/02
Country: US
Date added to database: 2022-08-21 16:47:30
