Automated reinforcement-learning-based application manager that learns and improves a reward function

Abstract

This document is directed to automated reinforcement-learning-based application managers that learn and improve the reward function that steers reinforcement-learning-based systems towards optimal or near-optimal policies. When first installed and launched, the application manager may rely on action inputs from human application managers, and on the resulting state/action trajectories, to accumulate enough information to generate an initial reward function. During subsequent operation, when it is determined that the application manager is no longer following a policy consistent with the type of management desired by human application managers, it may use accumulated trajectories to improve the reward function.
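The abstract does not say how the reward function is learned or how policy drift is detected, so the following Python is only an illustrative sketch under stated assumptions, not the patented method. It assumes states are small numeric feature vectors, that the reward is linear in those features, and that the weights can be approximated by the difference between the average state reached under human management and the average state reached by the automated manager. Drift is assumed to be measured as the fraction of human-recorded steps on which the current policy picks the same action. All names (`fit_reward_weights`, `agreement`, the action labels, the threshold) are hypothetical.

```python
from typing import Callable, List, Sequence, Tuple

State = Tuple[float, ...]   # hypothetical numeric state features
Step = Tuple[State, str]    # (state, action taken in that state)
Trajectory = List[Step]


def mean_features(trajectories: Sequence[Trajectory]) -> List[float]:
    """Average the state feature vectors over every step of every trajectory."""
    states = [state for traj in trajectories for state, _ in traj]
    if not states:
        raise ValueError("need at least one recorded step")
    dim = len(states[0])
    return [sum(s[i] for s in states) / len(states) for i in range(dim)]


def fit_reward_weights(human: Sequence[Trajectory],
                       own: Sequence[Trajectory]) -> List[float]:
    """Assumed heuristic: weights point from the manager's own average state
    towards the average state observed under human management."""
    h = mean_features(human)
    o = mean_features(own)
    return [hi - oi for hi, oi in zip(h, o)]


def reward(weights: Sequence[float], state: State) -> float:
    """Linear reward: dot product of the weights and the state features."""
    return sum(w * x for w, x in zip(weights, state))


def agreement(policy: Callable[[State], str],
              human: Sequence[Trajectory]) -> float:
    """Fraction of human-recorded steps where the policy picks the same action."""
    steps = [step for traj in human for step in traj]
    if not steps:
        raise ValueError("need at least one recorded step")
    matches = sum(1 for state, action in steps if policy(state) == action)
    return matches / len(steps)


# Toy data: feature 0 is a load level, feature 1 is a "healthy" flag.
human_trajs = [[((0.25, 1.0), "scale_up"), ((0.75, 1.0), "hold")]]
own_trajs = [[((1.0, 0.0), "hold"), ((1.0, 0.0), "hold")]]

# Initial reward function, generated from the accumulated trajectories.
weights = fit_reward_weights(human_trajs, own_trajs)


def current_policy(state: State) -> str:
    """Stand-in for the manager's current policy; it always holds."""
    return "hold"


# Assumed drift check: re-fit the reward when agreement falls below a threshold.
AGREEMENT_THRESHOLD = 0.8  # hypothetical value
needs_refit = agreement(current_policy, human_trajs) < AGREEMENT_THRESHOLD
if needs_refit:
    weights = fit_reward_weights(human_trajs, own_trajs)
```

In this toy run the learned weights penalize high load and reward the healthy flag, and the always-hold policy matches only half of the human actions, so the sketch flags the reward function for re-fitting. A real system would need a proper inverse-reinforcement-learning or preference-learning procedure in place of the averaging heuristic.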
