System and Method for Reinforcement Learning Supporting Delayed Rewards

Abstract

The present invention relates to a reinforcement learning method that supports delayed rewards in a reinforcement learning system. The method, which uses a reinforcement learning agent, comprises the steps of: receiving, from an environment system, an immediate reward value and a delayed reward value associated with a control action; generating a final reward value corresponding to the control action by taking into account the received immediate reward value and delayed reward value; generating a transition tuple that includes the final reward value; and applying the generated transition tuple to the reinforcement learning agent to perform reinforcement learning. According to an embodiment of the present invention, because a delayed reward value that the environment system measures after a delay can be applied to the control action it directly relates to, the performance and speed of the reinforcement learning system can be increased.
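The steps in the abstract can be illustrated with a minimal Python sketch. The abstract does not say how the immediate and delayed reward (compensation) values are combined, so the weighted sum used here is an assumption, as are all names (`DelayedRewardBuffer`, `record_step`, `resolve`, `delay_weight`, `action_id`). This is not the patented implementation, only one way the described flow might look.

```python
from dataclasses import dataclass
from typing import Any, Dict, List


@dataclass
class PendingStep:
    """A control action whose delayed reward has not arrived yet (hypothetical)."""
    state: Any
    action: Any
    immediate_reward: float
    next_state: Any
    done: bool


@dataclass
class Transition:
    """Transition tuple handed to the agent, carrying the final reward."""
    state: Any
    action: Any
    reward: float
    next_state: Any
    done: bool


class DelayedRewardBuffer:
    """Holds steps until their delayed reward is reported by the environment."""

    def __init__(self, delay_weight: float = 1.0) -> None:
        # Assumption: final reward = immediate + delay_weight * delayed.
        self.delay_weight = delay_weight
        self._pending: Dict[int, PendingStep] = {}
        self.replay: List[Transition] = []  # transitions ready for learning

    def record_step(self, action_id: int, state: Any, action: Any,
                    immediate_reward: float, next_state: Any, done: bool) -> None:
        # Store the step under an id so a later delayed reward can be
        # matched to the control action it relates to.
        self._pending[action_id] = PendingStep(
            state, action, immediate_reward, next_state, done)

    def resolve(self, action_id: int, delayed_reward: float) -> Transition:
        # Combine both reward values and emit the transition tuple.
        step = self._pending.pop(action_id)
        final_reward = step.immediate_reward + self.delay_weight * delayed_reward
        transition = Transition(
            step.state, step.action, final_reward, step.next_state, step.done)
        self.replay.append(transition)
        return transition

    def pending_count(self) -> int:
        return len(self._pending)
```

In use, each control action would be passed to `record_step` when it is taken, and `resolve` would be called once the environment reports the delayed value for that `action_id`. The transitions collected in `replay` could then feed any standard off-policy update. Keying pending steps by an action id is what lets the delayed value be credited to the action that caused it, rather than to whichever action happens to be current when the measurement arrives.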
