Safe Reinforcement Learning: Learning with Supervision Using a Constraint-Admissible Set


Abstract

Despite recent advances in Reinforcement Learning (RL), its applications in real-world engineering systems are still rare. The primary reason is that RL algorithms involve exploratory actions that can lead to system constraint violations. These violations can damage physical systems and even cause safety issues, e.g., battery overheating, robot breakdown, and car crashes, hindering RL deployment in many engineering applications. In this paper, we develop a novel safe RL framework that guarantees safety during learning by exploiting a constraint-admissible set for supervision. System knowledge and recursive feasibility techniques are exploited to construct a state-dependent constraint-admissible set. We develop a new learning scheme where the constraint-admissible set regulates the exploratory actions from the RL agent and simultaneously guides the agent to learn the system constraints with a penalty for control regulation. The proposed safe RL algorithm is demonstrated in an adaptive cruise control example where a nonlinear fuel economy cost function is optimized without violating system constraints. We demonstrate that the safe RL agent is able to learn the system constraints to gradually fade out the control supervisor.
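
The supervision idea in the abstract can be illustrated with a small sketch. It is not the authors' construction: the relative double-integrator car-following model, the constant-speed lead vehicle, the numeric limits, the braking-distance test used as the admissible set, the bisection search, and the linear penalty are all assumptions chosen for illustration. The sketch shows a supervisor that passes an exploratory action through unchanged when the next state stays admissible, otherwise replaces it with the largest admissible acceleration and reports a penalty proportional to the correction.

```python
import numpy as np

# All constants below are illustrative assumptions, not values from the paper.
DT = 0.1                   # step length [s]
A_MIN, A_MAX = -3.0, 2.0   # ego acceleration limits [m/s^2]
D_MIN = 5.0                # minimum allowed gap to the lead vehicle [m]


def step(state, a):
    """Relative dynamics (gap, v_lead - v_ego) with a constant-speed lead car."""
    d, v_rel = state
    return np.array([d + v_rel * DT - 0.5 * a * DT**2, v_rel - a * DT])


def margin(state):
    """Smallest future gap minus D_MIN if the ego brakes at A_MIN from now on.

    States with margin >= 0 form the (assumed) constraint-admissible set:
    full braking keeps them inside it, which gives recursive feasibility.
    """
    d, v_rel = state
    closing = min(v_rel, 0.0)  # only a closing speed consumes gap
    return d - closing**2 / (2.0 * abs(A_MIN)) - D_MIN


def supervise(state, a_rl, penalty_weight=1.0, iters=30):
    """Return (safe action, penalty) for the action proposed by the agent."""
    a = float(np.clip(a_rl, A_MIN, A_MAX))
    if margin(step(state, a)) < 0.0:
        # margin(step(state, a)) is non-increasing in a, so bisect for the
        # largest admissible acceleration between full braking and the proposal.
        lo, hi = A_MIN, a
        for _ in range(iters):
            mid = 0.5 * (lo + hi)
            if margin(step(state, mid)) >= 0.0:
                lo = mid
            else:
                hi = mid
        a = lo
    penalty = penalty_weight * abs(a - a_rl)
    return a, penalty


def rollout(steps=300, seed=0):
    """Drive with deliberately aggressive random actions under supervision."""
    rng = np.random.default_rng(seed)
    state = np.array([30.0, 0.0])
    min_gap, interventions = state[0], 0
    for _ in range(steps):
        a_rl = rng.uniform(0.0, A_MAX)   # stand-in for an exploring RL agent
        a, penalty = supervise(state, a_rl)
        interventions += penalty > 0.0
        state = step(state, a)
        min_gap = min(min_gap, state[0])
    return float(min_gap), int(interventions)
```

In a training loop the penalty would be subtracted from the reward (for example, `reward = -fuel_cost - penalty`, with `fuel_cost` a hypothetical name), so that an agent which learns to stay inside the admissible set triggers the supervisor less and less often.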

Bibliographic details

Source
American Control Conference | 2018 | 720p | 6 pages
Conference location
Authors
Zhaojian Li; Uros Kalabic; Tianshu Chu
Author affiliations
Conference organizer
Original format: PDF
Text language
Chinese Library Classification: Automatic control theory
Keywords

Similar literature

1. Multi-objective safe reinforcement learning: the relationship between multi-objective reinforcement learning and safe reinforcement learning [J]. Naoto Horie, Tohgoroh Matsui, Koichi Moriyama. Artificial life and robotics. 2019, No. 3
2. A Deep Learning Algorithm for the Max-Cut Problem Based on Pointer Network Structure with Supervised Learning and Reinforcement Learning Strategies [J]. Shenshen Gu, Yue Yang. Mathematics. 2020, No. 2
3. Hybrid Reinforcement/Supervised Learning of Dialogue Policies from Fixed Data Sets [J]. Henderson J, Lemon O, Georgila K. Computational linguistics. 2008, No. 4
4. Safe Reinforcement Learning: Learning with Supervision Using a Constraint-Admissible Set [C]. Zhaojian Li, Uros Kalabic, Tianshu Chu. American Control Conference. 2018
5. Understanding Model-Based Reinforcement Learning and its Application in Safe Reinforcement Learning [D]. Hu, Dingcheng. 2019
6. Weakly Supervised Reinforcement Learning for Autonomous Highway Driving via Virtual Safety Cages [O]. Sampo Kuutti, Richard Bowden, Saber Fallah. 2021
7. A Comparison Of Supervised And Reinforcement Learning Methods On A Reinforcement Learning Task [O]. Vijaykumar Gullapalli. 1992
