
State transition rule acquisition device, action selection learning device, action selection device, state transition rule acquisition method, action selection method, and program


Abstract

The object of the present invention is to obtain the states and state transition rules needed for selecting actions, even in an environment where the states or the state transition rules are unknown. When a selected action is performed, a state acquisition unit 210 acquires the state of the environment after the action. A reward calculation unit calculates, based on the acquired state and the selected action, the reward for performing that action. A parameter updating unit 240 updates, based on the selected action and the reward, the parameters of a model that takes a state as input and selects an action. An action selecting unit 270 then uses the model, with the post-action state as input, to select the next action. Acquisition, calculation, updating, and selection are repeated until an iteration end condition is satisfied. The state acquisition unit 210 compares each acquired state with the set of states already acquired and, if the state is new, adds it to the set. The state transition rules are acquired based on this set of states. [Selected figure] Figure 2
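
The loop described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the abstract does not say what kind of model is used, so tabular Q-learning with epsilon-greedy selection is swapped in here, and the corridor environment `ToyEnv`, the reward function `reward_fn`, the fixed step budget, and the representation of a transition rule as a `(state, action) -> next state` mapping are all hypothetical.

```python
import random
from collections import defaultdict


class ToyEnv:
    """Hypothetical 1-D corridor with positions 0..size-1 (not from the source)."""

    def __init__(self, size=5):
        self.size = size
        self.pos = 0

    def step(self, action):
        # action is -1 (left) or +1 (right); the walls clamp the position
        self.pos = max(0, min(self.size - 1, self.pos + action))
        return self.pos


def reward_fn(state, action, goal):
    # Assumed reward: computed from the acquired post-action state
    # (the action is accepted but unused in this toy example)
    return 1.0 if state == goal else -0.01


def explore(env, actions=(-1, 1), steps=1000, eps=0.2, alpha=0.5, gamma=0.9, seed=0):
    rng = random.Random(seed)
    q = defaultdict(float)   # model parameters: Q[(state, action)] (assumed tabular)
    states = {env.pos}       # set of states acquired so far
    rules = {}               # assumed rule format: (state, action) -> next state
    state = env.pos
    goal = env.size - 1

    for _ in range(steps):   # iteration end condition: fixed step budget (assumed)
        # Action selection: the model takes the current state as input
        if rng.random() < eps:
            action = rng.choice(actions)
        else:
            action = max(actions, key=lambda a: q[(state, a)])

        # State acquisition: observe the environment after the action
        nxt = env.step(action)
        if nxt not in states:          # compare with the set of known states
            states.add(nxt)            # add only if the state is new
        rules[(state, action)] = nxt   # record the observed transition

        # Reward calculation and parameter update (Q-learning update rule)
        r = reward_fn(nxt, action, goal)
        best_next = max(q[(nxt, a)] for a in actions)
        q[(state, action)] += alpha * (r + gamma * best_next - q[(state, action)])

        state = nxt

    return states, rules, q


if __name__ == "__main__":
    found_states, found_rules, _ = explore(ToyEnv(size=5))
    print(sorted(found_states))
    for (s, a), n in sorted(found_rules.items()):
        print(f"state {s}, action {a:+d} -> state {n}")
```

The agent starts with no knowledge of how many positions exist. The state set and the transition table are filled in only from what it observes while learning, which is the part of the abstract this sketch is meant to illustrate.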


Bibliographic data

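Publication number: JP2019079227A

Patent type:
Publication date: 2019-05-23

Original format: PDF
Applicant/patentee: NIPPON TELEGR & TELEPH CORP NTT; UNIV OF TOKYO

Application number: JP20170205050
Inventors: 鈴木潤; 鶴岡慶雅

Filing date: 2017-10-24
Classification: G06N3/08; G06N3/04; G06N20
Country: JP
Database entry time: 2022-08-21 12:24:09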
