TRAINING ACTION SELECTION NEURAL NETWORKS USING HINDSIGHT MODELLING

Abstract

A reinforcement learning method and system that selects actions to be performed by a reinforcement learning agent interacting with an environment. A causal model is implemented by a hindsight model neural network and trained using hindsight, i.e., using future environment state trajectories. Because the method and system do not have access to this future information when selecting an action, the hindsight model neural network is used to train a model neural network that is conditioned on data from current observations and learns to predict an output of the hindsight model neural network.
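
The two-stage training described in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the patent: the toy environment, the use of scalar linear models in place of neural networks, the choice to regress the hindsight model's output directly on the return, and the names `train_hindsight_sketch` and `predict_at_decision_time`. The hindsight model sees the future trajectory; the model network sees only the current observation and is trained to predict the hindsight model's output.

```python
import random


def train_hindsight_sketch(steps=5000, lr=0.1, seed=0):
    """Toy illustration (hypothetical): linear stand-ins for both networks."""
    rng = random.Random(seed)
    a = [0.0, 0.0]   # hindsight model weights: input is the future trajectory
    w, b = 0.0, 0.0  # model network weights: input is the current observation only

    for _ in range(steps):
        # Assumed toy environment: two future states are noisy copies of the
        # current observation, and the return is their sum.
        obs = rng.uniform(-1.0, 1.0)
        future = [obs + rng.gauss(0.0, 0.1), obs + rng.gauss(0.0, 0.1)]
        ret = sum(future)

        # 1) Train the hindsight model on the future trajectory.
        #    Simplification: its output is regressed directly on the return.
        phi = a[0] * future[0] + a[1] * future[1]
        err_h = phi - ret
        a = [a[i] - lr * err_h * future[i] for i in range(2)]

        # 2) Train the model network to predict the hindsight output from the
        #    current observation. phi is treated as a fixed target here, so
        #    this step does not update the hindsight model.
        phi_hat = w * obs + b
        err_m = phi_hat - phi
        w -= lr * err_m * obs
        b -= lr * err_m

    return a, w, b


def predict_at_decision_time(w, b, obs):
    """Only the model network is used when acting; no future data is needed."""
    return w * obs + b


if __name__ == "__main__":
    a, w, b = train_hindsight_sketch()
    print("hindsight weights:", a)
    print("model network prediction for obs=0.5:", predict_at_decision_time(w, b, 0.5))
```

In this sketch the model network's prediction would stand in for the hindsight output wherever a value or policy computation needs it at action-selection time; how the actual system consumes that prediction is not specified in the abstract.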
