
Model-predictive control and reinforcement learning in multi-energy system case studies

Abstract

Model predictive control (MPC) offers an optimal control technique to establish and ensure that the total operation cost of multi-energy systems remains at a minimum while fulfilling all system constraints. However, this method presumes an adequate model of the underlying system dynamics, which is prone to modelling errors and is not necessarily adaptive. This carries an associated initial and ongoing project-specific engineering cost. In this paper, we present an on- and off-policy multi-objective reinforcement learning (RL) approach that does not assume a model a priori, and benchmark it against a linear MPC (LMPC, chosen to reflect current practice, although non-linear MPC performs better), with both derived from the general optimal control problem to highlight their differences and similarities. In a case study on a simple multi-energy system (MES) configuration, we show that a twin delayed deep deterministic policy gradient (TD3) RL agent has the potential to match and outperform the perfect-foresight LMPC benchmark (101.5%), while the realistic LMPC, i.e. with imperfect predictions, achieves only 98%. In a more complex MES configuration, the RL agent's performance is generally lower (94.6%), yet still better than that of the realistic LMPC (88.9%). In both case studies, the RL agents outperformed the realistic LMPC after a training period of 2 years using quarter-hourly interactions with the environment. We conclude that reinforcement learning is a viable optimal control technique for multi-energy systems, given adequate constraint handling and pre-training to avoid unsafe interactions and long training periods, as proposed in future work.
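
Both controllers in the paper are derived from the same general optimal control problem. As a reading aid, here is a minimal sketch of the kind of finite-horizon linear MPC formulation this implies; the notation (states x_k, inputs u_k, exogenous inputs w_k, horizon N, matrices A, B, E, cost vectors c_k) is generic and not taken from the paper.

```latex
% Generic finite-horizon linear MPC problem; all symbols are
% illustrative placeholders, not the paper's notation.
\begin{aligned}
\min_{u_0, \ldots, u_{N-1}} \quad
  & \sum_{k=0}^{N-1} c_k^{\top} u_k
  &\quad& \text{total operation cost over horizon } N \\
\text{s.t.} \quad
  & x_{k+1} = A x_k + B u_k + E w_k
  && \text{linear system model driven by exogenous inputs } w_k \\
  & x_k \in \mathcal{X}, \quad u_k \in \mathcal{U}
  && \text{state and input constraints} \\
  & x_0 = x(t)
  && \text{measured initial state}
\end{aligned}
```

In these terms, the perfect-foresight benchmark corresponds to the exogenous inputs w_k being known exactly over the horizon, whereas the realistic LMPC solves the same problem with imperfect forecasts of w_k.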
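To make the RL side concrete, the sketch below trains a TD3 agent on a toy multi-energy-system environment, assuming stable-baselines3 and gymnasium are installed. The MESEnv class, its single-battery dynamics, its price signal, and its reward are hypothetical placeholders for illustration only; they are not the simulator, constraint handling, or agent configuration used in the paper.

```python
# Minimal sketch: a TD3 agent on a hypothetical toy MES environment.
# Assumes stable-baselines3 and gymnasium; MESEnv is NOT the paper's
# simulator, just an illustrative single-battery placeholder.
import numpy as np
import gymnasium as gym
from gymnasium import spaces
from stable_baselines3 import TD3


class MESEnv(gym.Env):
    """Toy MES: one battery, quarter-hourly steps, cost-based reward."""

    def __init__(self, episode_steps=96):  # 96 quarter-hours = 1 day
        super().__init__()
        self.episode_steps = episode_steps
        # Observation: [battery state of charge, electricity price, demand]
        self.observation_space = spaces.Box(0.0, 1.0, shape=(3,), dtype=np.float32)
        # Action: normalised battery power, charge (-1) to discharge (+1)
        self.action_space = spaces.Box(-1.0, 1.0, shape=(1,), dtype=np.float32)

    def _obs(self):
        # Hypothetical daily price wave and flat demand, both in [0, 1].
        price = 0.5 + 0.5 * np.sin(2.0 * np.pi * self.t / self.episode_steps)
        return np.array([self.soc, price, 0.5], dtype=np.float32)

    def reset(self, seed=None, options=None):
        super().reset(seed=seed)
        self.t, self.soc = 0, 0.5
        return self._obs(), {}

    def step(self, action):
        price = float(self._obs()[1])
        power = float(action[0])
        # Clipping stands in for the constraint handling the paper says
        # RL needs before it is safe to deploy.
        self.soc = float(np.clip(self.soc - 0.05 * power, 0.0, 1.0))
        grid_import = max(0.5 - power, 0.0)  # demand minus battery discharge
        reward = -price * grid_import        # negative operation cost
        self.t += 1
        return self._obs(), reward, False, self.t >= self.episode_steps, {}


if __name__ == "__main__":
    env = MESEnv()
    model = TD3("MlpPolicy", env, verbose=0)
    # ~2 simulated years of quarter-hourly interactions, as in the abstract.
    model.learn(total_timesteps=2 * 365 * 96)
```

With quarter-hourly steps, the 2-year training budget reported in the abstract corresponds to roughly 2 * 365 * 96 = 70,080 environment interactions, which is the budget the learn() call above simulates.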

Bibliographic details

  • Source
    Applied Energy | 2021, Issue 1 | 117634.1-117634.12 | 12 pages
  • Author affiliations

    ABB Hoge Wei 27 B-1930 Zaventem Belgium|Vrije Univ Brussel VUB ETEC MOBI Pl Laan 2 B-1050 Brussels Belgium|Vrije Univ Brussel VUB AI Lab Pl Laan 2 B-1050 Brussels Belgium;

    Katholieke Univ Leuven ESAT ELECTA Kasteelpk Arenberg 10 B-3001 Leuven Belgium|EnergyVille B-3600 Genk Belgium;

    Katholieke Univ Leuven ESAT ELECTA Kasteelpk Arenberg 10 B-3001 Leuven Belgium;

    ABB Hoge Wei 27 B-1930 Zaventem Belgium;

    Katholieke Univ Leuven ESAT ELECTA Kasteelpk Arenberg 10 B-3001 Leuven Belgium|EnergyVille B-3600 Genk Belgium;

    Katholieke Univ Leuven Dept Mech Engn TME Celestijnenlaan 300 B-3001 Leuven Belgium|EnergyVille B-3600 Genk Belgium;

    Vrije Univ Brussel VUB AI Lab Pl Laan 2 B-1050 Brussels Belgium;

    Vrije Univ Brussel VUB ETEC MOBI Pl Laan 2 B-1050 Brussels Belgium;

    Vrije Univ Brussel VUB ETEC MOBI Pl Laan 2 B-1050 Brussels Belgium;

  • Indexed in: Science Citation Index (SCI); Engineering Index (EI)
  • Format: PDF
  • Language: English (eng)
  • Keywords

    Model-predictive control; Reinforcement learning; Optimal control; Multi-energy systems;
