Control Engineering Practice

Real-time energy purchase optimization for a storage-integrated photovoltaic system by deep reinforcement learning



Abstract

The objective of this article is to minimize the cost of energy purchased on a real-time basis for a storage-integrated photovoltaic (PV) system installed in a microgrid. Given non-linear storage charging/discharging characteristics and uncertain solar generation, demand, and market prices, this is a complex task. It requires a proper tradeoff between storing too much and too little energy in the battery: in the former case future excess PV energy is lost, and in the latter case demand is exposed to future high electricity prices. We propose a reinforcement learning approach to deal with a non-stationary environment and non-linear storage characteristics. To make this approach applicable, a novel formulation of the decision problem is presented, which focuses on the optimization of grid energy purchases rather than on direct storage control. This limits the complexity of the state and action spaces, making it possible to achieve satisfactory learning speed and avoid stability issues. The Q-learning algorithm, combined with a dense deep neural network for function representation, is then used to learn an optimal decision policy. The algorithm incorporates enhancements that prior work found to improve learning speed and stability, such as experience replay, a target network, and an increasing discount factor. Extensive simulation results on real data confirm that our approach is effective and outperforms rule-based heuristics.
