
Reinforcement learning method, reinforcement learning program, and reinforcement learning device



Abstract

PROBLEM TO BE SOLVED: To improve the learning efficiency of reinforcement learning. SOLUTION: A value function learning unit 403 performs a unit learning step, learning a value function based on the received state of a wind power generation facility 400, the reward of the wind power generation facility 400, and the action applied to the wind power generation facility 400. An experience level calculation unit 404 updates an experience level function based on the same received state, reward, and action. Using the experience level function, the experience level calculation unit 404 calculates the experience level of the current state or action of the wind power generation facility 400 and the experience level of another state or action. A value function correction unit 405 determines, based on the value function and the experience levels, whether to further update the value function. When it determines that the value function is to be updated, the value function correction unit 405 uses monotonicity to update the value function based on the value function and the experience levels. [Selection diagram] Fig. 4
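The abstract does not say how the experience level is measured or how monotonicity is applied, so the following is only a minimal sketch under stated assumptions, not the patented method. It assumes a tabular Q-learning update as the unit learning step, a visit count as the experience level, and a value function that is non-decreasing in the state index for a fixed action. The class name MonotoneCorrectedQ and the parameters alpha, gamma, and min_experience are hypothetical.

```python
class MonotoneCorrectedQ:
    """Illustrative sketch only; all names and design choices are assumptions."""

    def __init__(self, n_states, n_actions, alpha=0.5, gamma=0.9, min_experience=3):
        self.n_states = n_states
        self.alpha = alpha                    # learning rate (assumed)
        self.gamma = gamma                    # discount factor (assumed)
        self.min_experience = min_experience  # visits needed before a value is trusted
        # Value function: one Q-value per (state, action) pair.
        self.q = [[0.0] * n_actions for _ in range(n_states)]
        # Experience level function: here simply a visit count (assumption).
        self.experience = [[0] * n_actions for _ in range(n_states)]

    def learn_step(self, s, a, r, s_next):
        """Unit learning step, then experience update, then optional correction."""
        target = r + self.gamma * max(self.q[s_next])
        self.q[s][a] += self.alpha * (target - self.q[s][a])
        self.experience[s][a] += 1
        self._correct(s, a)

    def _correct(self, s, a):
        """Propagate a well-experienced value to less-experienced states.

        Assumes Q(s, a) is non-decreasing in s for a fixed action a.
        """
        if self.experience[s][a] < self.min_experience:
            return  # current value is not yet trusted; no further update
        v = self.q[s][a]
        for other in range(self.n_states):
            if self.experience[other][a] >= self.experience[s][a]:
                continue  # never overwrite an equally or more experienced value
            if other < s and self.q[other][a] > v:
                self.q[other][a] = v  # lower states may not exceed v
            elif other > s and self.q[other][a] < v:
                self.q[other][a] = v  # higher states may not fall below v


agent = MonotoneCorrectedQ(n_states=5, n_actions=1, alpha=1.0, gamma=0.0, min_experience=2)
agent.learn_step(2, 0, 1.0, 2)  # first visit: below threshold, no correction
agent.learn_step(2, 0, 1.0, 2)  # second visit: values of states 3 and 4 are raised
print(agent.q)
```

In this sketch, a single well-visited state updates the values of unvisited neighbours, which is one way a monotonicity assumption could reduce the number of trials needed. Whether the patent uses visit counts or this clipping rule is not stated in the abstract.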


Bibliographic data

Publication number: JP2020119139A

Publication date: 2020-08-06

Original format: PDF
Applicant/Patentee: 富士通株式会社 (Fujitsu Limited)

Application number: JP20190008512
Inventors: 重住淳一; 岩根秀直; 屋並仁史

Filing date: 2019-01-22
Classification: G06N20
Country: JP
Date added to database: 2022-08-21 11:36:08
