A General Framework to Increase Safety of Learning Algorithms for Dynamical Systems Based on Region of Attraction Estimation

Zhou Zhehua; Oguz Ozgur S.; Leibold Marion; Buss Martin

Abstract

Although the state-of-the-art learning approaches exhibit impressive results for dynamical systems, only a few applications on real physical systems have been presented. One major impediment is that the intermediate policy during the training procedure may result in behaviors that are not only harmful to the system itself but also to the environment. In essence, imposing safety guarantees for learning algorithms is vital for autonomous systems acting in the real world. In this article, we propose a computationally effective and general safe learning framework, specifically for complex dynamical systems. With a proper definition of the safe region, a supervisory control strategy, which switches the actions applied on the system between the learning-based controller and a predefined corrective controller, is given. A simplified system facilitates the estimation of the safe region for the high-dimensional dynamical system. During the learning phase, the belief of the safe region is updated with the actual execution results of the corrective controller, which in turn enables the learning-based controller to have more freedom in choosing its actions. Two examples are given to demonstrate the performance of the proposed framework, one simple inverted pendulum to illustrate the online adaptation method, and one quadcopter control task to show the overall performance.
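The supervisory switching described in the abstract can be illustrated with a minimal sketch on the inverted pendulum. Everything specific here is an assumption made for illustration and is not taken from the paper: the pendulum model, the PD gains `K` and `D`, the random stand-in for the learning-based controller, the one-step lookahead switching rule, and the use of a sublevel set of a closed-loop Lyapunov function (`energy`) as the safe-region estimate. The online update of the safe-region belief is not modelled.

```python
import math
import random

DT = 0.001          # Euler step in seconds (assumed)
G = 9.81            # gravity / pendulum length in 1/s^2 (assumed)
K, D = 30.0, 8.0    # PD gains of the corrective controller (assumed)
THETA_SAFE = 0.5    # largest resting angle inside the estimate, rad (assumed)


def step(state, u):
    """One explicit-Euler step; theta = 0 is upright, u is an angular acceleration."""
    theta, omega = state
    return (theta + DT * omega, omega + DT * (G * math.sin(theta) + u))


def corrective_controller(state):
    """Predefined PD controller that drives the pendulum back to upright."""
    theta, omega = state
    return -K * theta - D * omega


def learning_controller(state, rng):
    """Hypothetical untrained policy: noisy input that keeps pushing the pendulum over."""
    return 10.0 + rng.uniform(-10.0, 10.0)


def energy(state):
    """Lyapunov function of the closed loop under the corrective controller.

    In continuous time its derivative along closed-loop trajectories is -D * omega**2.
    """
    theta, omega = state
    return 0.5 * omega**2 + 0.5 * K * theta**2 + G * (math.cos(theta) - 1.0)


LEVEL = energy((THETA_SAFE, 0.0))  # safe-region estimate: {x : energy(x) <= LEVEL}


def in_safe_region(state):
    return energy(state) <= LEVEL


def supervisor(state, rng):
    """Use the learning action only if the predicted next state stays in the estimate."""
    u_learn = learning_controller(state, rng)
    if in_safe_region(step(state, u_learn)):
        return u_learn, "learning"
    return corrective_controller(state), "corrective"


def run(n_steps=5000, seed=0):
    """Simulate the supervised loop; return the largest |theta| seen and action counts."""
    rng = random.Random(seed)
    state = (0.0, 0.0)
    max_angle = 0.0
    counts = {"learning": 0, "corrective": 0}
    for _ in range(n_steps):
        u, source = supervisor(state, rng)
        counts[source] += 1
        state = step(state, u)
        max_angle = max(max_angle, abs(state[0]))
    return max_angle, counts


if __name__ == "__main__":
    print(run())
```

In this sketch the stand-in policy alone would tip the pendulum over, while the supervisor keeps the angle close to `THETA_SAFE` by handing control to the corrective controller near the boundary of the estimate. A larger `LEVEL` would leave the learning-based controller more freedom, which is the effect the abstract attributes to updating the safe-region belief.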

Bibliographic record

Source: IEEE Transactions on Robotics, 2020, No. 5, pp. 1472-1490 (19 pages)
Authors: Zhou Zhehua; Oguz Ozgur S.; Leibold Marion; Buss Martin
Affiliation (all four authors): Tech Univ Munich, Chair Automat Control Engn, D-80290 Munich, Germany
Original format: PDF
Language: English
Keywords:
Safety; Dynamical systems; Heuristic algorithms; Reinforcement learning; Control systems; Task analysis; Estimation; Deep learning in robotics and automation; learning and adaptive systems; robot safety; safe reinforcement learning;

Similar literature

1. Use case repository framework based on machine learning algorithm to analyze the software development estimation with intelligent information systems [J]. Lalitha R., Latha B., Sumathi G. International Journal of Wavelets, Multiresolution and Information Processing. 2020, No. 1
2. A novel classification learning framework based on estimation of distribution algorithms [J]. Jiancong Fan, Qiang Xu, Yongquan Liang. International Journal of Computing Science and Mathematics. 2012, No. 4
3. Learning Algorithms for a Class of Knowledge-Based Systems with Dynamical Knowledge Representation [J]. Zdzislaw Bubnicki. Systems Science. 2000, No. 1
4. Controller design and region of attraction estimation for nonlinear dynamical systems [C]. Milan Korda, Didier Henrion, Colin N. Jones. IFAC World Congress. 2014
5. Towards Building Autonomy and Intelligence for Surgical Robotic Systems Using Trajectory Optimization, Stochastic Estimation, Vision-Based Control, and Machine Learning Algorithms [D]. Aghajani Pedram, Sahba. 2020
6. Learning Linear Dynamical Systems from Multivariate Time Series: A Matrix Factorization Based Framework [O]. Zitao Liu, Milos Hauskrecht
7. Controller design and region of attraction estimation for nonlinear dynamical systems [O]. Milan Korda, Didier Henrion, Colin N. Jones. 2014
