AUTOMATED REINFORCEMENT-LEARNING-BASED APPLICATION MANAGER THAT LEARNS AND IMPROVES A REWARD FUNCTION

Abstract

This document describes automated reinforcement-learning-based application managers that learn and improve the reward function, which steers reinforcement-learning-based systems toward optimal or near-optimal policies. When first installed and launched, the automated reinforcement-learning-based application manager may rely on action inputs from human application managers, and on the resulting state/action trajectories, to accumulate enough information to generate an initial reward function. During subsequent operation, when the manager is determined to be no longer following a policy consistent with the type of management that human application managers want, it may use accumulated trajectories to improve the reward function.
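The two phases described above can be illustrated with a minimal sketch. The abstract does not say how the reward function is represented, how it is learned, or how the manager detects that it has drifted from the desired policy, so everything below is an assumption made for illustration: the reward is a table over (state, action) pairs, it is fitted as the fraction of times demonstrated trajectories chose an action in a state (a crude stand-in for inverse reinforcement learning), and drift is detected by comparing a greedy policy against recent human actions with a hypothetical `threshold`. The state and action names are invented.

```python
from collections import Counter, defaultdict


def fit_reward(trajectories):
    """Estimate reward[(state, action)] as the fraction of demonstrated
    steps in `state` that chose `action` (an illustrative assumption)."""
    counts = defaultdict(Counter)
    for trajectory in trajectories:
        for state, action in trajectory:
            counts[state][action] += 1
    reward = {}
    for state, action_counts in counts.items():
        total = sum(action_counts.values())
        for action, n in action_counts.items():
            reward[(state, action)] = n / total
    return reward


def greedy_policy(reward):
    """For each state, pick the action with the highest estimated reward."""
    best = {}
    for (state, action), r in reward.items():
        if state not in best or r > reward[(state, best[state])]:
            best[state] = action
    return best


def agreement(policy, trajectories):
    """Fraction of demonstrated steps where the policy picks the same action."""
    steps = [(s, a) for t in trajectories for s, a in t]
    if not steps:
        return 1.0
    return sum(policy.get(s) == a for s, a in steps) / len(steps)


def maybe_improve(reward, accumulated, recent_human, threshold=0.8):
    """Re-fit the reward from all trajectories if the current greedy policy
    agrees with recent human actions less often than `threshold`."""
    if agreement(greedy_policy(reward), recent_human) >= threshold:
        return reward, False
    return fit_reward(accumulated + recent_human), True


# Phase 1: hypothetical human-manager trajectories yield an initial reward.
initial = [
    [("overloaded", "scale_up"), ("normal", "noop")],
    [("overloaded", "scale_up"), ("idle", "scale_down")],
]
reward = fit_reward(initial)

# Phase 2: recent human behavior no longer matches the learned policy,
# so the reward is re-fitted from the accumulated trajectories.
recent = [[("overloaded", "restart")] * 3]
reward, changed = maybe_improve(reward, initial, recent)
new_policy = greedy_policy(reward)
```

In this toy run the initial policy chooses `scale_up` when overloaded, the recent human trajectories disagree on every step, and the re-fitted reward shifts the greedy choice to `restart`. A real system would presumably use a richer state representation and a proper reward-learning method; this sketch only shows the control flow of "learn initially, then improve on detected drift".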
