Device and procedure for training a control strategy for a control device over multiple iterations

Abstract

A procedure is described for training a control strategy for a control device over multiple iterations. In each iteration, an exploration strategy is determined for the current version of the control strategy, and multiple simulation runs are carried out. In each simulation run, for each state of a sequence of states beginning with the initial state of the run, an action is selected according to the exploration strategy and checked for safety, until either a safe action has been selected or a maximum number of actions, equal to two, has been selected. If a safe action has been selected, the follow-up state in the sequence of states is determined by simulating the execution of the selected action. If no safe action has been selected by the time the maximum number is reached, either the simulation run is aborted or a specified safe action, if one exists, is selected, and the follow-up state is determined by simulating the execution of that safe action. The sequence of states, together with the selected actions and the rewards received in the states, is collected as data of the simulation run. For the iteration, the value of a loss function is determined over the data of the simulation runs performed, and the control strategy is adapted to a new version such that the loss function value is reduced.
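The loop in the abstract can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the patent: the 1-D track environment, the tabular softmax policy `theta`, the epsilon-mixed exploration strategy, the "stay in place" fallback action, and the plain REINFORCE-style surrogate loss (which ignores any correction for the exploration noise and for rejected actions). Only the overall structure (bounded resampling of actions until a safe one is found, fallback to a specified safe action, collection of run data, one loss-reducing update per iteration) follows the text.

```python
import math
import random

ACTIONS = (-1, 0, 1)         # hypothetical action set: left, stay, right
LOW, HIGH, GOAL = 0, 6, 5    # hypothetical track; leaving [LOW, HIGH] is unsafe
MAX_TRIES = 2                # maximum number of sampled actions per state
FALLBACK = 0                 # assumed "specified safe action": stay in place


def is_safe(state, action):
    """An action is safe if the resulting position stays on the track."""
    return LOW <= state + action <= HIGH


def log_probs(theta, state):
    """Numerically stable log-softmax over the preferences of one state."""
    prefs = theta[state]
    m = max(prefs)
    log_z = m + math.log(sum(math.exp(p - m) for p in prefs))
    return [p - log_z for p in prefs]


def explore(theta, state, rng, eps=0.1):
    """Exploration strategy: current policy mixed with uniform noise."""
    if rng.random() < eps:
        return rng.randrange(len(ACTIONS))
    weights = [math.exp(lp) for lp in log_probs(theta, state)]
    return rng.choices(range(len(ACTIONS)), weights=weights)[0]


def rollout(theta, rng, start=0, horizon=10):
    """One simulation run; returns a list of (state, action_index, reward)."""
    data, state = [], start
    for _ in range(horizon):
        idx = None
        for _ in range(MAX_TRIES):           # resample at most MAX_TRIES times
            cand = explore(theta, state, rng)
            if is_safe(state, ACTIONS[cand]):
                idx = cand
                break
        if idx is None:                       # no safe action found: fall back
            idx = ACTIONS.index(FALLBACK)
        next_state = state + ACTIONS[idx]     # follow-up state by simulation
        data.append((state, idx, -abs(GOAL - next_state)))
        state = next_state
    return data


def train_iteration(theta, rng, n_runs=20, lr=0.05):
    """Collect runs, evaluate the surrogate loss, take one gradient step."""
    runs = [rollout(theta, rng) for _ in range(n_runs)]
    loss = 0.0
    grads = [[0.0] * len(ACTIONS) for _ in theta]
    for run in runs:
        ret = 0.0
        for state, idx, reward in reversed(run):
            ret += reward                     # return-to-go from this step
            logp = log_probs(theta, state)
            loss -= ret * logp[idx] / n_runs
            for k in range(len(ACTIONS)):
                # d(-ret * log pi(idx|state)) / d theta[state][k]
                grads[state][k] -= ret * ((k == idx) - math.exp(logp[k])) / n_runs
    for s, row in enumerate(grads):
        for k, g in enumerate(row):
            theta[s][k] -= lr * g             # gradient step on the loss
    return loss, runs


if __name__ == "__main__":
    rng = random.Random(0)
    theta = [[0.0] * len(ACTIONS) for _ in range(HIGH - LOW + 1)]
    for it in range(10):
        loss, _ = train_iteration(theta, rng)
        print(it, round(loss, 3))
```

In this sketch the fallback is always available, so a run is never aborted. An environment without a known safe action would instead have to end the run at that point, as the abstract allows.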
