IFAC PapersOnLine

Multi-step Greedy Reinforcement Learning Based on Model Predictive Control



Abstract

Reinforcement learning aims to compute optimal control policies with the help of data from closed-loop trajectories. Traditional model-free approaches need a huge number of data points to achieve acceptable performance, rendering them inapplicable in most real-world situations, even when the data can be obtained from a detailed simulator. Model-based reinforcement learning approaches try to leverage model knowledge to drastically reduce the amount of data needed, or to enforce important constraints on the closed-loop operation, the lack of which is another important drawback of model-free approaches. This paper proposes a novel model-based reinforcement learning approach. The main novelty is that we exploit all the information of a model predictive control (MPC) computation step, not only the first input that is actually applied to the plant, to efficiently learn a good approximation of the state value function. This approximation can then be included in an MPC formulation as a terminal cost with a short prediction horizon, achieving performance similar to that of an MPC with a very long prediction horizon. Simulation results for a discretized batch bioreactor illustrate the potential of the proposed methodology.
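To make the idea concrete, below is a minimal sketch (not the authors' code) of this multi-step use of an MPC solve. It assumes a toy linear-quadratic plant, so the long-horizon MPC reduces to an exact Riccati recursion; every state along the predicted trajectory, paired with its cost-to-go, becomes a training target for a quadratic value approximation, which then serves as the terminal cost of a short-horizon MPC. All names (`mpc_rollout`, `collect_value_targets`) and the system matrices are illustrative assumptions; the paper itself uses a discretized batch bioreactor and a general nonlinear MPC.

```python
# Sketch: learn a terminal value function from full MPC trajectories, then
# reuse it in a short-horizon MPC. Toy LQ setting; all quantities assumed.
import numpy as np

A = np.array([[1.0, 0.1], [0.0, 1.0]])   # hypothetical linear dynamics
B = np.array([[0.0], [0.1]])
Q, R = np.eye(2), np.array([[0.01]])     # stage cost x'Qx + u'Ru

def mpc_rollout(x0, N, P_term):
    """Finite-horizon LQR, i.e. exact MPC for this linear-quadratic toy:
    backward Riccati pass with terminal cost x'P_term x, forward rollout."""
    P = [None] * (N + 1)
    K = [None] * N
    P[N] = P_term
    for k in range(N - 1, -1, -1):
        S = R + B.T @ P[k + 1] @ B
        K[k] = np.linalg.solve(S, B.T @ P[k + 1] @ A)
        P[k] = Q + A.T @ P[k + 1] @ (A - B @ K[k])
    xs, costs, x = [x0], [], x0
    for k in range(N):
        u = -K[k] @ x
        costs.append(float(x @ Q @ x + u @ R @ u))
        x = A @ x + B @ u
        xs.append(x)
    return xs, costs

def collect_value_targets(x0, N):
    """One long-horizon solve yields N+1 (state, cost-to-go) training pairs,
    not just the first applied input -- the key multi-step idea."""
    xs, costs = mpc_rollout(x0, N, P_term=Q)
    tail = float(xs[-1] @ Q @ xs[-1])        # cost-to-go at the horizon end
    pairs = []
    for k in reversed(range(N + 1)):
        pairs.append((xs[k], tail))
        if k > 0:
            tail += costs[k - 1]             # accumulate stage costs backwards
    return pairs

def fit_quadratic_value(pairs):
    """Least-squares fit of V(x) = x' M x to the collected cost-to-go data."""
    Phi = np.array([[x[0] ** 2, 2 * x[0] * x[1], x[1] ** 2] for x, _ in pairs])
    y = np.array([v for _, v in pairs])
    a, b, c = np.linalg.lstsq(Phi, y, rcond=None)[0]
    return np.array([[a, b], [b, c]])

rng = np.random.default_rng(0)
data = []
for _ in range(20):                           # a few long-horizon solves
    data += collect_value_targets(rng.uniform(-1, 1, size=2), N=50)
M = fit_quadratic_value(data)

# A 5-step MPC with the learned terminal cost now mimics the 50-step MPC.
xs_short, _ = mpc_rollout(np.array([1.0, 0.0]), N=5, P_term=M)
```

The design point this sketch illustrates: a single N-step solve produces N+1 value targets instead of one, which is what lets the learned terminal cost converge with far fewer closed-loop interactions than a model-free method would need.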
