Reinforcement Learning and Feedback Control: Using Natural Decision Methods to Design Optimal Adaptive Controllers

Lewis F.L.; Vrabie D.; Vamvoudakis K.G.

首页> 外文期刊>Control Systems, IEEE >Reinforcement Learning and Feedback Control: Using Natural Decision Methods to Design Optimal Adaptive Controllers

【24h】

Reinforcement Learning and Feedback Control: Using Natural Decision Methods to Design Optimal Adaptive Controllers

机译：强化学习和反馈控制：使用自然决策方法设计最佳自适应控制器

获取原文

获取原文并翻译 | 示例

掌桥外文数据库（机构版） >>

开具论文收录证明 >>

页面导航

摘要
著录项
相似文献
相关主题

摘要

This article describes the use of principles of reinforcement learning to design feedback controllers for discrete- and continuous-time dynamical systems that combine features of adaptive control and optimal control. Adaptive control [1], [2] and optimal control [3] represent different philosophies for designing feedback controllers. Optimal controllers are normally designed of ine by solving Hamilton JacobiBellman (HJB) equations, for example, the Riccati equation, using complete knowledge of the system dynamics. Determining optimal control policies for nonlinear systems requires the offline solution of nonlinear HJB equations, which are often difficult or impossible to solve. By contrast, adaptive controllers learn online to control unknown systems using data measured in real time along the system trajectories. Adaptive controllers are not usually designed to be optimal in the sense of minimizing user-prescribed performance functions. Indirect adaptive controllers use system identification techniques to first identify the system parameters and then use the obtained model to solve optimal design equations [1]. Adaptive controllers may satisfy certain inverse optimality conditions [4].

机译：本文介绍了如何使用强化学习原理为离散和连续时间动力系统设计反馈控制器，该控制器结合了自适应控制和最优控制的功能。自适应控制[1]，[2]和最佳控制[3]代表了设计反馈控制器的不同理念。通常，通过使用完整的系统动力学知识来求解Hamilton JacobiBellman（HJB）方程（例如Riccati方程）来设计最优控制器。确定非线性系统的最优控制策略需要非线性HJB方程的离线解，这通常很难或不可能解决。相比之下，自适应控制器会在线学习以使用沿系统轨迹实时测量的数据来控制未知系统。从最小化用户指定的性能功能的角度来看，自适应控制器通常设计得不是最佳的。间接自适应控制器使用系统识别技术首先识别系统参数，然后使用获得的模型来求解最佳设计方程[1]。自适应控制器可以满足某些逆最优条件[4]。

著录项

来源
《Control Systems, IEEE》 |2012年第6期|76-105|共30页
作者
Lewis F.L.; Vrabie D.; Vamvoudakis K.G.;
展开▼
作者单位

Autom. & Robot. Res. Inst., Univ. of Texas at Arlington, Worth, TX, USA|c|;

展开▼
收录信息
原文格式 PDF
正文语种 eng
中图分类
关键词
入库时间 2022-08-18 01:17:00

相似文献

外文文献
中文文献
专利

1. Adaptive optimal output feedback tracking control for unknown discrete-time linear systems using a combined reinforcement Q-learning and internal model method [J] . Control Theory & Applications, IET . 2019,第18期

机译：结合强化Q学习和内部模型方法的未知离散时间线性系统的自适应最优输出反馈跟踪控制
2. Automated Design of Adaptive Controllers for Modular Robots using Reinforcement Learning [J] . Paulina Varshavskaya, Leslie Pack Kaelbling, Daniela Rus The International journal of robotics research . 2008,第3a4期

机译：基于强化学习的模块化机器人自适应控制器自动化设计
3. Multiagent Reinforcement Learning for Integrated Network of Adaptive Traffic Signal Controllers (MARLIN-ATSC): Methodology and Large-Scale Application on Downtown Toronto [J] . El-Tantawy, S., Abdulhai, IEEE Transactions on Intelligent Transportation Systems . 2013,第3期

机译：自适应交通信号控制器集成网络（MARLIN-ATSC）的多主体强化学习：方法和在多伦多市中心的大规模应用
4. Optimal adaptive control for unknown systems using output feedback by reinforcement learning methods [C] . Lewis F.L., Vamvoudakis Kyriakos G. Proceedings of the 8th IEEE International Conference on Control and Automation . 2010

机译：强化学习方法利用输出反馈对未知系统进行最优自适应控制
5. Dynamic tuning of PI-controllers based on model-free Reinforcement Learning methods. [D] . Abbasi Brujeni, Lena. 2010

机译：基于无模型强化学习方法的PI控制器的动态调整。
6. Ultrasensitive Negative Feedback Control: A Natural Approach for the Design of Synthetic Controllers [O] . Francesco Montefusco, Ozgur E. Akman, Orkun S. Soyer, 2011

机译：超灵敏的负反馈控制：设计合成控制器的自然方法
7. Volume 2, Issue 3, Special issue on Recent Advances in Engineering Systems (Published Papers) Articles Transmit / Received Beamforming for Frequency Diverse Array with Symmetrical frequency offsets Shaddrack Yaw Nusenu Adv. Sci. Technol. Eng. Syst. J. 2(3), 1-6 (2017); View Description Detailed Analysis of Amplitude and Slope Diffraction Coefficients for knife-edge structure in S-UTD-CH Model Eray Arik, Mehmet Baris Tabakcioglu Adv. Sci. Technol. Eng. Syst. J. 2(3), 7-11 (2017); View Description Applications of Case Based Organizational Memory Supported by the PAbMM Architecture Martín, María de los Ángeles, Diván, Mario José Adv. Sci. Technol. Eng. Syst. J. 2(3), 12-23 (2017); View Description Low Probability of Interception Beampattern Using Frequency Diverse Array Antenna Shaddrack Yaw Nusenu Adv. Sci. Technol. Eng. Syst. J. 2(3), 24-29 (2017); View Description Zero Trust Cloud Networks using Transport Access Control and High Availability Optical Bypass Switching Casimer DeCusatis, Piradon Liengtiraphan, Anthony Sager Adv. Sci. Technol. Eng. Syst. J. 2(3), 30-35 (2017); View Description A Derived Metrics as a Measurement to Support Efficient Requirements Analysis and Release Management Indranil Nath Adv. Sci. Technol. Eng. Syst. J. 2(3), 36-40 (2017); View Description Feedback device of temperature sensation for a myoelectric prosthetic hand Yuki Ueda, Chiharu Ishii Adv. Sci. Technol. Eng. Syst. J. 2(3), 41-40 (2017); View Description Deep venous thrombus characterization: ultrasonography, elastography and scattering operator Thibaud Berthomier, Ali Mansour, Luc Bressollette, Frédéric Le Roy, Dominique Mottier Adv. Sci. Technol. Eng. Syst. J. 2(3), 48-59 (2017); View Description Improving customs’ border control by creating a reference database of cargo inspection X-ray images Selina Kolokytha, Alexander Flisch, Thomas Lüthi, Mathieu Plamondon, Adrian Schwaninger, Wicher Vasser, Diana Hardmeier, Marius Costin, Caroline Vienne, Frank Sukowski, Ulf Hassler, Irène Dorion, Najib Gadi, Serge Maitrejean, Abraham Marciano, Andrea Canonica, Eric Rochat, Ger Koomen, Micha Slegt Adv. Sci. Technol. Eng. Syst. J. 2(3), 60-66 (2017); View Description Aviation Navigation with Use of Polarimetric Technologies Arsen Klochan, Ali Al-Ammouri, Viktor Romanenko, Vladimir Tronko Adv. Sci. Technol. Eng. Syst. J. 2(3), 67-72 (2017); View Description Optimization of Multi-standard Transmitter Architecture Using Single-Double Conversion Technique Used for Rescue Operations Riadh Essaadali, Said Aliouane, Chokri Jebali and Ammar Kouki Adv. Sci. Technol. Eng. Syst. J. 2(3), 73-81 (2017); View Description Singular Integral Equations in Electromagnetic Waves Reflection Modeling A. S. Ilinskiy, T. N. Galishnikova Adv. Sci. Technol. Eng. Syst. J. 2(3), 82-87 (2017); View Description Methodology for Management of Information Security in Industrial Control Systems: A Proof of Concept aligned with Enterprise Objectives. Fabian Bustamante, Walter Fuertes, Paul Diaz, Theofilos Toulqueridis Adv. Sci. Technol. Eng. Syst. J. 2(3), 88-99 (2017); View Description Dependence-Based Segmentation Approach for Detecting Morpheme Boundaries Ahmed Khorsi, Abeer Alsheddi Adv. Sci. Technol. Eng. Syst. J. 2(3), 100-110 (2017); View Description Paper Improving Rule Based Stemmers to Solve Some Special Cases of Arabic Language Soufiane Farrah, Hanane El Manssouri, Ziyati Elhoussaine, Mohamed Ouzzif Adv. Sci. Technol. Eng. Syst. J. 2(3), 111-115 (2017); View Description Medical imbalanced data classification Sara Belarouci, Mohammed Amine Chikh Adv. Sci. Technol. Eng. Syst. J. 2(3), 116-124 (2017); View Description ADOxx Modelling Method Conceptualization Environment Nesat Efendioglu, Robert Woitsch, Wilfrid Utz, Damiano Falcioni Adv. Sci. Technol. Eng. Syst. J. 2(3), 125-136 (2017); View Description GPSR+Predict: An Enhancement for GPSR to Make Smart Routing Decision by Anticipating Movement of Vehicles in VANETs Zineb Squalli Houssaini, Imane Zaimi, Mohammed Oumsis, Saïd El Alaoui Ouatik Adv. Sci. Technol. Eng. Syst. J. 2(3), 137-146 (2017); View Description Optimal Synthesis of Universal Space Vector Digital Algorithm for Matrix Converters Adrian Popovici, Mircea Băbăiţă, Petru Papazian Adv. Sci. Technol. Eng. Syst. J. 2(3), 147-152 (2017); View Description Control design for axial flux permanent magnet synchronous motor which operates above the nominal speed Xuan Minh Tran, Nhu Hien Nguyen, Quoc Tuan Duong Adv. Sci. Technol. Eng. Syst. J. 2(3), 153-159 (2017); View Description A synchronizing second order sliding mode control applied to decentralized time delayed multi−agent robotic systems: Stability Proof Marwa Fathallah, Fatma Abdelhedi, Nabil Derbel Adv. Sci. Technol. Eng. Syst. J. 2(3), 160-170 (2017); View Description Fault Diagnosis and Tolerant Control Using Observer Banks Applied to Continuous Stirred Tank Reactor Martin F. Pico, Eduardo J. Adam Adv. Sci. Technol. Eng. Syst. J. 2(3), 171-181 (2017); View Description Development and Validation of a Heat Pump System Model Using Artificial Neural Network Nabil Nassif, Jordan Gooden Adv. Sci. Technol. Eng. Syst. J. 2(3), 182-185 (2017); View Description Assessment of the usefulness and appeal of stigma-stop by psychology students: a serious game designed to reduce the stigma of mental illness Adolfo J. Cangas, Noelia Navarro, Juan J. Ojeda, Diego Cangas, Jose A. Piedra, José Gallego Adv. Sci. Technol. Eng. Syst. J. 2(3), 186-190 (2017); View Description Kinect-Based Moving Human Tracking System with Obstacle Avoidance Abdel Mehsen Ahmad, Zouhair Bazzal, Hiba Al Youssef Adv. Sci. Technol. Eng. Syst. J. 2(3), 191-197 (2017); View Description A security approach based on honeypots: Protecting Online Social network from malicious profiles Fatna Elmendili, Nisrine Maqran, Younes El Bouzekri El Idrissi, Habiba Chaoui Adv. Sci. Technol. Eng. Syst. J. 2(3), 198-204 (2017); View Description Pulse Generator for Ultrasonic Piezoelectric Transducer Arrays Based on a Programmable System-on-Chip (PSoC) Pedro Acevedo, Martín Fuentes, Joel Durán, Mónica Vázquez, Carlos Díaz Adv. Sci. Technol. Eng. Syst. J. 2(3), 205-209 (2017); View Description Enabling Toy Vehicles Interaction With Visible Light Communication (VLC) M. A. Ilyas, M. B. Othman, S. M. Shah, Mas Fawzi Adv. Sci. Technol. Eng. Syst. J. 2(3), 210-216 (2017); View Description Analysis of Fractional-Order 2xn RLC Networks by Transmission Matrices Mahmut Ün, Manolya Ün Adv. Sci. Technol. Eng. Syst. J. 2(3), 217-220 (2017); View Description Fire extinguishing system in large underground garages Ivan Antonov, Rositsa Velichkova, Svetlin Antonov, Kamen Grozdanov, Milka Uzunova, Ikram El Abbassi Adv. Sci. Technol. Eng. Syst. J. 2(3), 221-226 (2017); View Description Directional Antenna Modulation Technique using A Two-Element Frequency Diverse Array Shaddrack Yaw Nusenu Adv. Sci. Technol. Eng. Syst. J. 2(3), 227-232 (2017); View Description Classifying region of interests from mammograms with breast cancer into BIRADS using Artificial Neural Networks Estefanía D. Avalos-Rivera, Alberto de J. Pastrana-Palma Adv. Sci. Technol. Eng. Syst. J. 2(3), 233-240 (2017); View Description Magnetically Levitated and Guided Systems Florian Puci, Miroslav Husak Adv. Sci. Technol. Eng. Syst. J. 2(3), 241-244 (2017); View Description Energy-Efficient Mobile Sensing in Distributed Multi-Agent Sensor Networks Minh T. Nguyen Adv. Sci. Technol. Eng. Syst. J. 2(3), 245-253 (2017); View Description Validity and efficiency of conformal anomaly detection on big distributed data Ilia Nouretdinov Adv. Sci. Technol. Eng. Syst. J. 2(3), 254-267 (2017); View Description S-Parameters Optimization in both Segmented and Unsegmented Insulated TSV upto 40GHz Frequency Juma Mary Atieno, Xuliang Zhang, HE Song Bai Adv. Sci. Technol. Eng. Syst. J. 2(3), 268-276 (2017); View Description Synthesis of Important Design Criteria for Future Vehicle Electric System Lisa Braun, Eric Sax Adv. Sci. Technol. Eng. Syst. J. 2(3), 277-283 (2017); View Description Gestural Interaction for Virtual Reality Environments through Data Gloves G. Rodriguez, N. Jofre, Y. Alvarado, J. Fernández, R. Guerrero Adv. Sci. Technol. Eng. Syst. J. 2(3), 284-290 (2017); View Description Solving the Capacitated Network Design Problem in Two Steps [O] . Meriem Khelifi, Mohand Yazid Saidi, Saadi Boudjit 2017

机译：第2卷，第3卷，工程系统最近进步的特殊问题（已发布论文）文章传输/接收频率各种阵列的波束成形，具有对称频率偏移Shaddrack偏航Nusenu Adv。 SCI。技术。 eng。系统。 J. 2（3），1-6（2017）;查看描述S-UTD-CH模型Eray Arik刀刃结构幅度和坡度衍射系数的详细分析，Mehmet Baris Tabakcioglu Adv。 SCI。技术。 eng。系统。 J. 2（3），7-11（2017）;查看描述案例基于组织内存的案例组织内存由PABMM ArchitectralMartín，MaríadeLosÁngeles，Diván，MarioJoséAven。 SCI。技术。 eng。系统。 J. 2（3），12-23（2017）;查看说明使用频率各种阵列天线Shaddrack偏航Nusenu Adv的低拦截横梁仪表概率。 SCI。技术。 eng。系统。 J. 2（3），24-29（2017）;查看说明零信任云网络使用传输访问控制和高可用性光学旁路交换套管切换西米列德·莱格托希金，安东尼Sager adv。 SCI。技术。 eng。系统。 J. 2（3），30-35（2017）;视图描述派生指标作为支持有效的需求分析和发布管理Indranil Nath ADV的测量。 SCI。技术。 eng。系统。 J. 2（3），36-40（2017）;视图描述肌电假肢yuki ueda的温度感觉反馈装置，恰米·伊莎。 SCI。技术。 eng。系统。 J. 2（3），41-40（2017）;查看描述深静脉血栓表征：超声检查，弹性造影和散射操作员Thibaud Berthomier，Ali Mansour，Luc Bressollette，FrédéricLeRoy，Dominique Mottier Adv。 SCI。技术。 eng。系统。 J. 2（3），48-59（2017）;查看说明通过创建货物检测的参考数据库来改进海关边界控制X射线图像Selina Kolokytha，Alexander Flisch，ThomasLüthi，Mathieu Plamondon，Adrian Schwaninger，Wiana Schwaninger，Wiana Hardmeier，Marius Costin，Caroline Vienne，Frank Sukowski，ULF哈桑德勒，伊瑞恩多森，纳吉·甘迪，塞尔格·马西亚诺，亚伯拉·马西亚诺，安德雷阿索尼卡，埃里克·罗·克，Ger Komen，Micha Slegt Adv。 SCI。技术。 eng。系统。 J. 2（3），60-66（2017）;查看说明航空导航使用偏光技术Arsen Klochan，Ali Al-Ammouri，Viktor Romanenko，Vladimir Tronko Adv。 SCI。技术。 eng。系统。 J. 2（3），67-72（2017）;查看描述使用用于救援运营的单双转换技术优化多标准变送器架构Riadue Essaadali，Chokri Jebali和Ammar Kouki Adv。 SCI。技术。 eng。系统。 J. 2（3），73-81（2017）;视图描述电磁波反射模型中的奇异积分方程A. S.Ilinskiy，T.Galishnikova Adv。 SCI。技术。 eng。系统。 J. 2（3），82-87（2017）;查看工业控制系统信息安全管理的描述方法：概念证明与企业目标对齐。 Fabian Bustamante，Walter Fuertes，Paul Diaz，Theofilos Toulqueridis adv。 SCI。技术。 eng。系统。 J. 2（3），88-99（2017年）;查看描述依赖基于依赖的分割方法，用于检测语素边界Ahmed Khorsi，Abeer Alsheddi Adv。 SCI。技术。 eng。系统。 J. 2（3），100-110（2017）;查看描述纸张改进了基于统治的犹太人，解决了阿拉伯语Soufiane Farrah，Hanane El Manssouri，Ziyati Elhoussaine，Mohamed Ouzzif Adv。 SCI。技术。 eng。系统。 J. 2（3），111-115（2017）;查看描述医疗不平衡数据分类Sara Belarouci，穆罕默德胺Chikh Adv。 SCI。技术。 eng。系统。 J. 2（3），116-124（2017）;查看描述adoxx建模方法概念化环境Nesat Efendioglu，Robert Woitsch，Wilfrid Utz，Damiano Falcioni Adv。 SCI。技术。 eng。系统。 J. 2（3），125-136（2017）;查看描述GPSR +预测：通过预期Vanets Zineb Squalli Houssaini，Imane Zaimi，Mohammed Oumsis，SaïdelAlaouiOuatik Advik Advik Advik Advik Advik Acik Adve，GPSR +预测SCI。技术。 eng。系统。 J.2（3），137-146（2017）;查看说明矩阵转换器通用空间矢量数字算法的最佳合成Adrian Popovici，MirceaBăBăIţă，Petru Papazian adv。 SCI。技术。 eng。系统。 J. 2（3），147-152（2017）;视图描述轴向磁通永磁同步电动机的控制设计，其在标称旋转Xuan Minh Tran，Nhu Hien Nguyen，CACoc Tuan Duong Adv。 SCI。技术。 eng。系统。 J. 2（3），153-159（2017）;视图说明A同步应用于分散时间延迟多功能机器人系统：稳定性证明Marwa Fathallah，Fatma Abdelhedi，Nabil Derbel Adv。 SCI。技术。 eng。系统。 J. 2（3），160-170（2017年）;查看描述故障诊断和耐受控制使用观察者银行应用于连续搅拌坦克反应器Martin F. Pico，Eduardo J. Adam Adv。 SCI。技术。 eng。系统。 J. 2（3），171-181（2017年）;查看说明用人工神经网络利用人工神经网络的热泵系统模型的开发和验证Nabil Nassif，Jordan Goodend Adv。 SCI。技术。 eng。系统。 J. 2（3），182-185（2017）;查看描述对心理学学生的耻辱 - 终止的有用性和吸引力的描述：一场严肃的比赛，旨在减少精神疾病的耻辱，诺埃尔·纳瓦罗，Juan J. Ojeda，迭戈库戈，何塞A. Piedra，joséGallego adv。 SCI。技术。 eng。系统。 J. 2（3），186-190（2017）;视图说明基于Kinect的移动人类跟踪系统，避免避让人Abdel Mehsen Ahmad，Zouhair Bazzal，Hiba Al Youssef Adv。 SCI。技术。 eng。系统。 J. 2（3），191-197（2017年）;视图描述基于蜜罐的安全方法：保护在线社交网络免受恶意配置文件FATNA Elmendili，Nisrine Maqran，Younes el Bouzekri El Idrissi，Habiba Chaoui Adv。 SCI。技术。 eng。系统。 J. 2（3），198-204（2017）;视图描述超声波压电传感器阵列的基于可编程系统的片上（PSoC）Pedro Acevedo，MartínFentes，JoelDurán，MónicaVázquez，CarlosDíazadv。 SCI。技术。 eng。系统。 J. 2（3），205-209（2017）;查看描述使玩具车辆与可见光通信（VLC）的交互（VLC）M.A.Ilyas，M. B. Othman，S. S. Shah，Mas Fawzi Adv。 SCI。技术。 eng。系统。 J. 2（3），210-216（2017）;查看说明分析分数2xN RLC网络传输矩阵MahmutÜn，ManolyaÜndadv。 SCI。技术。 eng。系统。 J. 2（3），217-220（2017年）;查看描述灭火系统在大型地下车库Ivan Antonov，Rositsa Velichkova，Svetlin Antonov，Kamen Grozdanov，Milka Uzunova，Ikram El Abbassi Adv。 SCI。技术。 eng。系统。 J. 2（3），221-226（2017）;查看说明使用双元频率各种阵列的定向天线调制技术Shaddrack偏航Nusenu Adv。 SCI。技术。 eng。系统。 J. 2（3），227-232（2017）;查看描述使用人工神经网络与乳腺癌与乳腺癌的乳腺X乳头乳腺癌的兴趣区域进行分类，使用人工神经网络EstefaníaD.Avalos-Rivera，Alberto de J. Pastana-Palma Adv。 SCI。技术。 eng。系统。 J.2（3），233-240（2017）;查看描述磁悬浮和引导系统Florian Puci，Miroslav Husak Adv。 SCI。技术。 eng。系统。 J. 2（3），241-244（2017年）;视图说明分布式多功能传感器网络中的节能移动感应minh t. nguyen adv。 SCI。技术。 eng。系统。 J. 2（3），245-253（2017年）;视图描述大分布式数据Ilia Nouretdinov Adv的保形异常检测的有效性和效率。 SCI。技术。 eng。系统。 J. 2（3），254-267（2017年）;查看描述S参数优化在分段和未分段绝缘TSV中高达40GHz频率Juma Mary Atieno，Xuliang Zhang，He Song Bai Adv。 SCI。技术。 eng。系统。 J. 2（3），268-276（2017年）;查看描述综合未来车辆电气系统的重要设计标准Lisa Braun，Eric Sax Adv。 SCI。技术。 eng。系统。 J. 2（3），277-283（2017年）;查看描述虚拟现实环境的故障交互通过数据手套G. Rodriguez，N.Jofre，Y.Alvarado，J.Fernández，R.Guerrero Adv。 SCI。技术。 eng。系统。 J. 2（3），284-290（2017年）;查看描述在两个步骤中解决电容网络设计问题

Reinforcement Learning and Feedback Control: Using Natural Decision Methods to Design Optimal Adaptive Controllers

摘要

著录项

相似文献

相关主题

期刊订阅