Journal of Energy Engineering

Stochastic Optimal CPS Relaxed Control Methodology for Interconnected Power Systems Using Q-Learning Method


Abstract

This paper presents the design and application of a novel stochastic optimal control methodology based on the Q-learning method for solving the automatic generation control (AGC) problem under the new control performance standards (CPS) of the North American Electric Reliability Council (NERC). The aims of CPS are to relax the control constraint requirements on AGC plant regulation and to enhance the frequency dispatch support from interconnected control areas. The NERC CPS-based AGC problem is a dynamic stochastic decision problem that can be modeled as a reinforcement learning (RL) problem based on Markov decision process theory. In this paper, the Q-learning method is adopted as the core RL algorithm, with CPS values regarded as the rewards from the interconnected power systems; the CPS control and relaxed control objectives are formulated as immediate reward functions by means of a linear weighted aggregation approach. By adjusting a closed-loop CPS control rule to maximize the long-term discounted reward during online learning, the optimal CPS control strategy is gradually obtained. This paper also introduces a practical semisupervisory group prelearning method to improve the stability and convergence of Q-learning controllers during the prelearning process. Tests on the China Southern Power Grid demonstrate that the proposed control strategy can effectively enhance the robustness and relaxation property of AGC systems while ensuring CPS compliance.
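
The abstract describes the controller only in outline. The following Python sketch is one possible reading of that outline: tabular Q-learning with epsilon-greedy exploration, an immediate reward built as a linear weighted aggregation of a CPS compliance term and a relaxation (control-effort) term, and a one-step update toward the long-term discounted reward. All names, weights, discretizations, and the toy plant stand-in are assumptions added for illustration; they are not taken from the paper.

    """Minimal sketch of the control idea in the abstract, not the authors' implementation.

    Assumptions (all illustrative): the AGC control cycle is discretized, the state is a
    coarse bin of the CPS1 value, actions are discrete regulation commands in MW, and the
    immediate reward linearly aggregates a CPS term and a relaxation (control-effort) term.
    """
    import random
    from collections import defaultdict

    ACTIONS = [-50.0, -10.0, 0.0, 10.0, 50.0]   # candidate regulation commands, MW (assumed)
    ALPHA, GAMMA, EPSILON = 0.1, 0.9, 0.1       # learning rate, discount, exploration (assumed)
    W_CPS, W_RELAX = 0.8, 0.2                   # linear aggregation weights (assumed)

    Q = defaultdict(float)                      # Q[(state, action)] -> estimated value

    def immediate_reward(cps1_percent, delta_p):
        """Linear weighted aggregation of a CPS compliance term and a relaxation term."""
        cps_term = (cps1_percent - 100.0) / 100.0            # positive when CPS1 exceeds 100%
        relax_term = -abs(delta_p) / max(map(abs, ACTIONS))  # penalize large regulation effort
        return W_CPS * cps_term + W_RELAX * relax_term

    def choose_action(state):
        """Epsilon-greedy selection over the discrete regulation commands."""
        if random.random() < EPSILON:
            return random.choice(ACTIONS)
        return max(ACTIONS, key=lambda a: Q[(state, a)])

    def q_update(state, action, reward, next_state):
        """Standard one-step Q-learning update toward the long-term discounted reward."""
        best_next = max(Q[(next_state, a)] for a in ACTIONS)
        Q[(state, action)] += ALPHA * (reward + GAMMA * best_next - Q[(state, action)])

    if __name__ == "__main__":
        # Toy stand-in for the interconnected power system: CPS1 drifts randomly and is
        # nudged by the regulation command. Purely illustrative, not a grid model.
        state, cps1 = 10, 100.0
        for _ in range(1000):
            action = choose_action(state)
            cps1 = max(0.0, cps1 + random.uniform(-2.0, 2.0) + 0.01 * action)
            next_state = int(cps1 // 10)          # coarse CPS1 bin used as the state
            reward = immediate_reward(cps1, action)
            q_update(state, action, reward, next_state)
            state = next_state

Under these assumptions, the learned greedy policy favors small regulation commands whenever the CPS term is already satisfied, which is one way to interpret the "relaxed control" objective the abstract describes.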
