Journal: IEEE Transactions on Neural Networks

Neurocontrollers trained with rules extracted by a genetic assisted reinforcement learning system

Abstract

This paper proposes a novel system for rule extraction in temporal control problems and presents a new way of designing neurocontrollers. The system employs a hybrid genetic search and reinforcement learning strategy for extracting the rules. The learning strategy requires no supervision and no reference model. The extracted rules are weighted micro rules that operate on small neighborhoods of the admissible control space. The extracted rules are refined further by applying additional genetic search and reinforcement to reduce the number of micro rules. This process results in a smaller set of macro rules, which can be used to train a feedforward multilayer perceptron neurocontroller. The micro rules or the macro rules may also be used directly in a table look-up controller. To demonstrate the macro-rule-based neurocontroller, we chose four benchmarks. In the first application we verify the capability of our system to learn optimal linear control strategies. The other three applications involve engine idle speed control, bioreactor control, and stabilizing two poles on a moving cart. These problems are highly nonlinear and unstable, and may include noise and delays in the plant dynamics. In terms of retrievals, the neurocontrollers generally outperform the controllers using a table look-up method. Both controllers, though, show robustness against noise disturbances and plant parameter variations.
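The abstract mentions that weighted rules may be used directly in a table look-up controller. The Python sketch below illustrates one way such a look-up could work. It is not the authors' implementation: the uniform grid over the state space, the "highest weight wins" selection, and every name in it (`MicroRule`, `cell_of`, `build_table`, `lookup_control`) are assumptions made only for illustration.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Sequence, Tuple

Cell = Tuple[int, ...]


@dataclass(frozen=True)
class MicroRule:
    """Hypothetical weighted rule covering one small neighborhood (grid cell)."""
    cell: Cell      # index of the neighborhood the rule applies to
    action: float   # control value the rule proposes
    weight: float   # strength assigned to the rule by some learning process


def cell_of(state: Sequence[float], lows: Sequence[float],
            highs: Sequence[float], bins: Sequence[int]) -> Cell:
    """Map a continuous state to its grid cell, clamping to the grid edges."""
    idx = []
    for x, lo, hi, n in zip(state, lows, highs, bins):
        frac = (x - lo) / (hi - lo)
        idx.append(min(n - 1, max(0, int(frac * n))))
    return tuple(idx)


def build_table(rules: Iterable[MicroRule]) -> Dict[Cell, float]:
    """For each cell, keep the action of the highest-weight rule (assumed policy)."""
    best: Dict[Cell, MicroRule] = {}
    for rule in rules:
        current = best.get(rule.cell)
        if current is None or rule.weight > current.weight:
            best[rule.cell] = rule
    return {cell: rule.action for cell, rule in best.items()}


def lookup_control(table: Dict[Cell, float], state: Sequence[float],
                   lows: Sequence[float], highs: Sequence[float],
                   bins: Sequence[int], default: float = 0.0) -> float:
    """Return the stored action for the state's cell, or a default if no rule covers it."""
    return table.get(cell_of(state, lows, highs, bins), default)
```

In this sketch, a controller would call `build_table` once on the extracted rules and then call `lookup_control` at every time step with the current plant state. Cells with no rule fall back to `default`, which is one of several possible choices.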
