Journal: Applied Energy

Experimental evaluation of model-free reinforcement learning algorithms for continuous HVAC control

Abstract

Controlling heating, ventilation and air-conditioning (HVAC) systems is crucial to improving demand-side energy efficiency. At the same time, the thermodynamics of buildings and uncertainties regarding human activities make effective management challenging. While model-free reinforcement learning demonstrates various advantages over existing strategies, the literature relies heavily on value-based methods that can hardly handle complex HVAC systems. This paper conducts experiments to evaluate four actor-critic algorithms in a simulated data centre. The performance evaluation is based on their ability to maintain thermal stability while increasing energy efficiency and on their adaptability to weather dynamics. Because data efficiency matters greatly for practical use, special attention is paid to it. Compared to the model-based controller implemented in EnergyPlus, all applied algorithms can reduce energy consumption by at least 10% while simultaneously keeping the hourly average temperature in the desired range. Robustness tests with different reward functions and weather conditions verify these results. With increasing training, we also see a smaller trade-off between thermal stability and energy reduction. Notably, the Soft Actor-Critic algorithm achieves a stable performance with ten times less data than on-policy methods. We therefore recommend using this algorithm in future experiments, due to both its interesting theoretical properties and its practical results.
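The abstract refers to reward functions that balance thermal stability against energy reduction but does not state their form. As a rough illustration only, one common way to express such a trade-off is a negative weighted sum of a comfort-band violation and the power draw. Everything in the sketch below is an assumption made for illustration: the function name `hvac_reward`, the 22 to 25 °C band, and both weights are hypothetical and are not taken from the paper.

```python
def hvac_reward(zone_temp_c, power_kw,
                temp_low_c=22.0, temp_high_c=25.0,
                temp_weight=1.0, energy_weight=0.01):
    """Hypothetical reward trading off thermal stability and energy use.

    All default values are illustrative assumptions, not values from the paper.
    """
    # Distance (in degrees C) by which the zone temperature leaves the band.
    if zone_temp_c < temp_low_c:
        violation = temp_low_c - zone_temp_c
    elif zone_temp_c > temp_high_c:
        violation = zone_temp_c - temp_high_c
    else:
        violation = 0.0
    # Larger violations and higher power draw both reduce the reward.
    return -(temp_weight * violation + energy_weight * power_kw)


if __name__ == "__main__":
    # Same power draw, inside vs. outside the assumed comfort band.
    print(hvac_reward(23.0, 100.0))
    print(hvac_reward(27.0, 100.0))
```

Raising `temp_weight` relative to `energy_weight` would push a learned policy toward tighter temperature control at the cost of energy savings, which is the kind of trade-off the robustness tests with different reward functions would probe.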
