Journal: Knowledge-Based Systems

Multi-step forecasting for big data time series based on ensemble learning

Abstract

This paper presents ensemble models for forecasting big data time series. An ensemble composed of three methods (decision tree, gradient boosted trees and random forest) is proposed because of the good results these methods have achieved in previous big data applications. The weights of the ensemble are computed by a weighted least squares method. Two strategies for updating the weights are considered, leading to either a static or a dynamic ensemble model. The predictions of each ensemble member are obtained by dividing the forecasting problem into h forecasting sub-problems, one for each value of the prediction horizon. These sub-problems are solved using machine learning algorithms from the big data engine Apache Spark, which ensures the scalability of the methodology. The performance of the proposed ensemble models is evaluated on Spanish electricity consumption data covering 10 years at a 10-minute frequency. The results show that both the dynamic and the static ensembles perform well, outperforming the individual ensemble members they combine. The dynamic ensemble is the most accurate model, achieving a mean relative error (MRE) of 2%, which is a very promising result for the prediction of big data time series. The proposed ensembles are also evaluated on solar power data from Australia covering two years at a 30-minute frequency. On these data they compare favourably with Artificial Neural Network, Pattern Sequence-based Forecasting and Deep Learning approaches, improving on their results.
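The two mechanisms named in the abstract, splitting the forecast into h sub-problems and fitting the ensemble weights by weighted least squares, can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation: the abstract does not give the input window length, the observation weights, any constraints on the ensemble weights, or the rule used to update them in the dynamic variant, and the Spark-based tree models are omitted entirely. The function names `make_direct_datasets`, `solve_linear_system` and `ensemble_weights` are hypothetical.

```python
def make_direct_datasets(series, window, horizon):
    """Build one (X, y) training set per forecast step 1..horizon.

    Every sub-problem shares the same inputs (the last `window` values)
    but targets a different step ahead. `window` is an assumed parameter.
    """
    datasets = []
    for step in range(1, horizon + 1):
        X, y = [], []
        # t is the index of the first value after the input window
        for t in range(window, len(series) - horizon + 1):
            X.append(series[t - window:t])
            y.append(series[t + step - 1])
        datasets.append((X, y))
    return datasets


def solve_linear_system(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [list(row) + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (M[r][n] - tail) / M[r][r]
    return x


def ensemble_weights(member_preds, actual, sample_weights=None):
    """Weighted least squares fit of one weight per ensemble member.

    member_preds[i][k] is member k's prediction for observation i.
    sample_weights (assumed, defaults to all ones) weights each observation.
    Solves the normal equations (P^T S P) w = P^T S y.
    """
    n_obs, n_members = len(actual), len(member_preds[0])
    s = sample_weights or [1.0] * n_obs
    A = [[sum(s[i] * member_preds[i][a] * member_preds[i][b]
              for i in range(n_obs))
          for b in range(n_members)]
         for a in range(n_members)]
    rhs = [sum(s[i] * member_preds[i][a] * actual[i] for i in range(n_obs))
           for a in range(n_members)]
    return solve_linear_system(A, rhs)
```

In this sketch a static ensemble would call `ensemble_weights` once on a fixed validation period, while a dynamic one would call it again as new observations arrive; how often and over which period the paper does so is not stated in the abstract. The solver raises `ZeroDivisionError` when the member predictions are linearly dependent.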