Journal: IFAC PapersOnLine

Constrained Q-Learning for Batch Process Optimization


Abstract

Chemical process optimization and control often require constraints to be satisfied for safe operation. Reinforcement learning (RL) has been shown to be a powerful control technique that can handle nonlinear stochastic optimal control problems. Despite this promise, RL has yet to see significant translation to industrial practice because of its inability to satisfy state constraints. This work aims to address this challenge. We propose an “oracle”-assisted constrained Q-learning algorithm that guarantees the satisfaction of joint chance constraints with high probability, which is required for safety-critical tasks. To that end, constraint tightenings (backoffs) are introduced and adjusted using Broyden’s method, which makes the backoffs self-tuned. This results in a general methodology that can be integrated into approximate dynamic programming-based algorithms to guarantee constraint satisfaction with high probability. Finally, a case study is presented to compare the performance of the proposed approach with that of model predictive control (MPC). The superior performance of the proposed algorithm, in terms of constraint handling, signifies a step toward the use of RL in real-world optimization and control of systems, where constraints are critical in ensuring safety.
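The backoff self-tuning mentioned in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it assumes a toy setting in which the constrained state is Gaussian around the tightened bound, so the satisfaction probability for a backoff b is Phi(b / sigma), and it uses the standard ("good") Broyden update to find the backoff that meets a target probability. The names `satisfaction_residual`, `broyden`, `sigma`, and `target` are hypothetical; in practice the residual would presumably be estimated from closed-loop simulations rather than computed analytically.

```python
import math
import numpy as np


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def satisfaction_residual(backoff, sigma=1.0, target=0.95):
    """Toy residual (assumption): satisfaction probability minus target.

    The state is assumed to be N(x_max - b, sigma^2), so that
    P(x <= x_max) = Phi(b / sigma).
    """
    b = float(backoff[0])
    return np.array([normal_cdf(b / sigma) - target])


def broyden(residual, x0, tol=1e-10, max_iter=50):
    """Solve residual(x) = 0 with Broyden's rank-one Jacobian update."""
    x = np.asarray(x0, dtype=float)
    f = residual(x)
    jac = np.eye(x.size)  # crude initial Jacobian guess
    for _ in range(max_iter):
        if np.linalg.norm(f) < tol:
            break
        dx = np.linalg.solve(jac, -f)   # quasi-Newton step
        x_new = x + dx
        f_new = residual(x_new)
        df = f_new - f
        # Rank-one correction so that jac @ dx == df (secant condition).
        jac = jac + np.outer(df - jac @ dx, dx) / (dx @ dx)
        x, f = x_new, f_new
    return x


# Start from zero backoff and tune it toward 95% satisfaction.
tuned_backoff = broyden(satisfaction_residual, x0=[0.0])
```

In this toy case the answer is known in closed form (sigma times the 0.95 standard normal quantile, about 1.645), which makes it a convenient check. With a single backoff the update reduces to the secant method; the same `broyden` routine accepts a vector of backoffs, one per constraint, although convergence then depends on the residual and the starting point.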
