Reinforcement learning method, reinforcement learning program, and reinforcement learning device

Abstract

PROBLEM TO BE SOLVED: To improve the learning efficiency of reinforcement learning. SOLUTION: A value function learning unit 403 performs a unit learning step, learning a value function based on the received state of the wind power generation facility 400, the reward from the wind power generation facility 400, and the action applied to the wind power generation facility 400. An experience level calculation unit 404 updates an experience level function based on the same received state, reward, and action. Using the experience level function, the experience level calculation unit 404 calculates the experience level of the current state or action of the wind power generation facility 400 and the experience level of another state or action. A value function correction unit 405 determines, based on the value function and the experience levels, whether to update the value function further. When it determines that the value function is to be updated, the value function correction unit 405 uses monotonicity to update the value function based on the value function and the experience levels. [Selected drawing] Fig. 4
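
The three units described in the abstract can be illustrated with a small tabular sketch. This is not the patented implementation: the abstract does not say how the experience level is computed or what form the monotonicity takes, so the sketch assumes (hypothetically) that the experience level is a visit count, that the learning step is ordinary Q-learning, and that the value is non-decreasing in the state index for a fixed action. The class name `ExperienceAwareQ` and its methods `learn` and `correct` are illustrative names only.

```python
class ExperienceAwareQ:
    """Tabular sketch: value learning, experience counting, monotone correction."""

    def __init__(self, n_states, n_actions, alpha=0.5, gamma=0.9):
        self.n_states = n_states
        self.alpha = alpha  # learning rate (assumed value)
        self.gamma = gamma  # discount factor (assumed value)
        # Value function Q(s, a), initialised to zero.
        self.q = [[0.0] * n_actions for _ in range(n_states)]
        # Experience level function, assumed here to be a visit count.
        self.experience = [[0] * n_actions for _ in range(n_states)]

    def learn(self, s, a, r, s_next):
        """Unit learning step (role of unit 403) plus experience update (unit 404)."""
        target = r + self.gamma * max(self.q[s_next])
        self.q[s][a] += self.alpha * (target - self.q[s][a])
        self.experience[s][a] += 1

    def correct(self, s, a):
        """Correction step (role of unit 405).

        Assumption: Q(s, a) should be non-decreasing in s. If a neighbouring
        state has a higher experience level and the current value violates
        that ordering, the current value is moved to the neighbour's value.
        Returns True when the value function was changed.
        """
        changed = False
        for other in (s - 1, s + 1):
            if not 0 <= other < self.n_states:
                continue
            # Only a more experienced neighbour is trusted as a bound.
            if self.experience[other][a] <= self.experience[s][a]:
                continue
            below = other < s and self.q[s][a] < self.q[other][a]
            above = other > s and self.q[s][a] > self.q[other][a]
            if below or above:
                self.q[s][a] = self.q[other][a]
                changed = True
        return changed


# Usage: state 0 is visited several times, state 1 never.
agent = ExperienceAwareQ(n_states=3, n_actions=1)
for _ in range(3):
    agent.learn(s=0, a=0, r=1.0, s_next=0)
was_corrected = agent.correct(s=1, a=0)
```

In this usage, the unvisited state 1 inherits a lower bound from the well-visited state 0, which is one way an ordering assumption could spread information to states with little experience.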
