Journal: IEEE Transactions on Industrial Informatics

Reinforcement Learning Based Decision Making of Operational Indices in Process Industry Under Changing Environment

Abstract

The plant-wide production process is composed of multiple unit processes, in which the operational indices of each unit process are assigned and adjusted according to product quality, yield, and actual operating modes. Because the operational conditions of the production process change, most model-based methods and evolutionary computation approaches cannot adjust the operational indices effectively. In this article, the decision making of operational indices is formulated as a continuous-state, continuous-action reinforcement learning (RL) problem, and a model-free RL algorithm is proposed that learns a decision policy to determine the operational indices according to the actual operational conditions. Unlike existing methods, this article presents a multiactor network ensemble algorithm and an actor-critic framework with a stochastic policy to avoid falling into local optima. A relatively globally optimal policy is obtained by extracting the results of parallel training of the multiactor networks, which guarantees the optimality of the obtained policy. In addition, experience replay is used to deal effectively with the lack of sampling data in model-free RL. Simulation studies are conducted on actual data from a mineral processing plant, and the results demonstrate the effectiveness of the proposed algorithm.
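
The two mechanisms named in the abstract, experience replay and an ensemble of actors, can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract does not say how the results of the parallel actors are "extracted", so picking the actor with the highest mean critic score is an assumption made here for illustration. The names `ReplayBuffer` and `select_actor` are hypothetical, and the actors and critic are plain callables standing in for trained networks.

```python
import random
from collections import deque
from typing import Callable, Deque, List, Sequence, Tuple

State = Tuple[float, ...]
Action = Tuple[float, ...]
# (state, action, reward, next_state)
Transition = Tuple[State, Action, float, State]


class ReplayBuffer:
    """Fixed-capacity store of past transitions, sampled uniformly.

    Reusing stored transitions lets a model-free learner take many
    updates from a limited amount of plant data.
    """

    def __init__(self, capacity: int) -> None:
        # deque(maxlen=...) silently drops the oldest item when full.
        self._data: Deque[Transition] = deque(maxlen=capacity)

    def add(self, transition: Transition) -> None:
        self._data.append(transition)

    def sample(self, batch_size: int, rng: random.Random) -> List[Transition]:
        # Never ask for more items than are stored.
        k = min(batch_size, len(self._data))
        return rng.sample(list(self._data), k)

    def __len__(self) -> int:
        return len(self._data)


def select_actor(
    actors: Sequence[Callable[[State], Action]],
    critic: Callable[[State, Action], float],
    states: Sequence[State],
) -> int:
    """Return the index of the actor with the highest mean critic score.

    Assumption: this selection rule is one simple way to combine
    actors trained in parallel; the source does not specify the rule.
    """

    def mean_value(actor: Callable[[State], Action]) -> float:
        return sum(critic(s, actor(s)) for s in states) / len(states)

    scores = [mean_value(actor) for actor in actors]
    return max(range(len(actors)), key=scores.__getitem__)
```

In a full training loop each actor would be updated from minibatches drawn with `sample`, and `select_actor` would be called on held-out states once parallel training has finished.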
