Journal: IFAC PapersOnLine

Model-free safe reinforcement learning for chemical processes using Gaussian processes


Abstract

Model-free reinforcement learning has recently been investigated for use in chemical process control. Through the iterative creation of an approximate process model, control actions can be explored and optimal policies generated. Typically, this approximate process model has taken the form of a neural network that is continuously updated. However, when only small quantities of historical data are available, for example in novel processes, neural networks tend to over-fit the data and perform poorly. In this paper, Gaussian processes are used as a function approximator to describe the action-value function of a non-isothermal semi-batch reactor. The analytical uncertainty obtained from Gaussian process predictions enables a trade-off between exploration and exploitation, allowing effective policies to be generated efficiently. Importantly, Gaussian processes also enable the probability of constraint violation to be modelled, ensuring safe constraint satisfaction throughout the learning procedure. When applied to the in-silico case study, a safe, effective policy was generated using only 100 evaluations of the process trajectory, with no prior knowledge of the process dynamics, a result that would require significantly more trajectory evaluations with a neural-network-based approach.
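The idea of using Gaussian process uncertainty both to balance exploration against exploitation and to bound the probability of constraint violation can be illustrated with a small sketch. This is not the paper's algorithm: the squared-exponential kernel, the upper-confidence-bound rule with weight `beta`, the convention that `g(a) <= 0` means safe, the 5% violation threshold, and all toy data are assumptions chosen for illustration, and the one-dimensional action stands in for a full state-action input.

```python
import math

def rbf(a, b, length=1.0, var=1.0):
    """Squared-exponential kernel between two scalar inputs (assumed kernel)."""
    return var * math.exp(-0.5 * ((a - b) / length) ** 2)

def cholesky(A):
    """Lower-triangular Cholesky factor of a symmetric positive-definite matrix."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = A[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = math.sqrt(s) if i == j else s / L[j][j]
    return L

def solve_lower(L, b):
    """Solve L x = b by forward substitution."""
    x = []
    for i in range(len(b)):
        x.append((b[i] - sum(L[i][k] * x[k] for k in range(i))) / L[i][i])
    return x

def solve_upper_T(L, b):
    """Solve L^T x = b by back substitution."""
    n = len(b)
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (b[i] - sum(L[k][i] * x[k] for k in range(i + 1, n))) / L[i][i]
    return x

def gp_posterior(X, y, x_new, noise=1e-4):
    """Posterior mean and standard deviation of a zero-mean GP at x_new."""
    n = len(X)
    K = [[rbf(X[i], X[j]) + (noise if i == j else 0.0) for j in range(n)]
         for i in range(n)]
    L = cholesky(K)
    alpha = solve_upper_T(L, solve_lower(L, y))   # (K + noise*I)^-1 y
    k_star = [rbf(x, x_new) for x in X]
    mean = sum(k * a for k, a in zip(k_star, alpha))
    v = solve_lower(L, k_star)
    var = rbf(x_new, x_new) - sum(vi * vi for vi in v)
    return mean, math.sqrt(max(var, 0.0))

def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def choose_action(actions, X, q_obs, g_obs, beta=2.0, max_violation_prob=0.05):
    """Return the probably-safe action with the highest upper confidence bound.

    Assumed convention: the constraint is satisfied when g(a) <= 0.
    Returns None if no candidate meets the violation-probability threshold.
    """
    best, best_score = None, -math.inf
    for a in actions:
        q_mean, q_std = gp_posterior(X, q_obs, a)
        g_mean, g_std = gp_posterior(X, g_obs, a)
        if g_std > 0.0:
            p_violate = 1.0 - normal_cdf((0.0 - g_mean) / g_std)  # P(g(a) > 0)
        else:
            p_violate = 1.0 if g_mean > 0.0 else 0.0
        if p_violate > max_violation_prob:
            continue  # reject actions that are too likely to violate the constraint
        score = q_mean + beta * q_std  # optimism rewards uncertain actions
        if score > best_score:
            best, best_score = a, score
    return best

# Toy data (hypothetical): three tried actions, their observed returns,
# and observed constraint values; the constraint is violated at a = 1.0.
X = [0.0, 0.5, 1.0]
q_obs = [0.1, 0.6, 0.9]
g_obs = [-1.0, -0.5, 0.4]
actions = [i / 10 for i in range(11)]
chosen = choose_action(actions, X, q_obs, g_obs)
print("chosen action:", chosen)
```

Raising `beta` favours exploration of poorly sampled actions, while lowering `max_violation_prob` makes the selection more conservative with respect to the constraint.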
