
A methodology for training set instance selection using mutual information in time series prediction



Abstract

Training set instance selection is an important preprocessing step in many machine learning problems, including time series prediction, and should be considered in practice in order to improve prediction quality and possibly reduce training time. Recently, mutual information (MI) has been proposed for regression tasks, mostly for feature selection and for identifying the real data in data sets that contain noise and outliers. This paper proposes a new methodology for training set instance selection in long-term time series prediction. The methodology combines a recursive prediction strategy with an advanced instance selection criterion: the nearest-neighbor-based MI estimator. At every prediction step, training instances are selected according to the MI computed between the initial training set instances and the current forecasting instance. The novelty of the approach is that it fits the initial training subset to the current forecasting instance and consequently reduces the uncertainty of the prediction. By selecting, at every prediction step, instances that share a large amount of MI with the current forecasting instance, the approach can reduce error propagation and accumulation, both well-known shortcomings of the recursive prediction strategy, and thus improve forecasting quality. The approach also differs from others in that it is proposed not as an outlier detector but as an instance selection method for data that do not necessarily contain noise and outliers. Results on data sets from the NN5 time series prediction competition indicate that the proposed method improves the quality of long-term time series prediction and reduces the number of instances needed to build the model.
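
The selection loop described in the abstract can be sketched in Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not say how MI between two instances is computed, which learner is trained, or how many instances are kept. The sketch assumes each instance is a lag window whose elements are treated as paired scalar samples, estimates MI with the Kraskov-Stögbauer-Grassberger (KSG) nearest-neighbor estimator, and uses the mean of the selected targets as a placeholder model. All function names and the parameters d, keep, and k are hypothetical.

```python
import math

EULER_GAMMA = 0.5772156649015329


def digamma_int(n):
    """Digamma function at a positive integer n: -gamma + sum_{j=1}^{n-1} 1/j."""
    return -EULER_GAMMA + sum(1.0 / j for j in range(1, n))


def ksg_mi(x, y, k=3):
    """KSG (algorithm 1) estimate of MI, in nats, from paired scalar samples.

    Assumption: x and y have equal length n with n > k.
    """
    n = len(x)
    total = 0.0
    for i in range(n):
        # Max-norm distances from point i to every other point in the joint space.
        dists = sorted(
            max(abs(x[i] - x[j]), abs(y[i] - y[j])) for j in range(n) if j != i
        )
        eps = dists[k - 1]  # distance to the k-th nearest neighbor
        # Count points strictly closer than eps in each marginal space.
        nx = sum(1 for j in range(n) if j != i and abs(x[i] - x[j]) < eps)
        ny = sum(1 for j in range(n) if j != i and abs(y[i] - y[j]) < eps)
        total += digamma_int(nx + 1) + digamma_int(ny + 1)
    return digamma_int(k) + digamma_int(n) - total / n


def make_instances(series, d):
    """Slide a window of length d over the series: (window, next value) pairs."""
    return [(series[t:t + d], series[t + d]) for t in range(len(series) - d)]


def recursive_forecast(series, d=8, horizon=5, keep=10, k=3):
    """Recursive multi-step forecast with per-step MI-based instance selection."""
    instances = make_instances(series, d)  # the initial training set
    window = list(series[-d:])             # current forecasting instance
    preds = []
    for _ in range(horizon):
        # Rank the initial training instances by MI with the current query window.
        ranked = sorted(
            instances, key=lambda inst: ksg_mi(inst[0], window, k), reverse=True
        )
        selected = ranked[:keep]
        # Placeholder model (an assumption): mean of the selected targets.
        y_hat = sum(target for _, target in selected) / len(selected)
        preds.append(y_hat)
        # Recursive strategy: feed the prediction back into the query window.
        window = window[1:] + [y_hat]
    return preds
```

For example, `recursive_forecast(series, d=8, horizon=3, keep=5)` returns three forecasts. In a real setting the placeholder mean would be replaced by a model trained on the selected subset at each step.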

Bibliographic details

  • Source
    Neurocomputing | 2014, No. 2 | pp. 236-245 | 10 pages
  • Author affiliations

    College of Applied Technical Sciences, Aleksandra Medvedeva 20, 18000 Nis, Serbia, Bulevar doktora Zorana Dindica 29/5, 18000 Nis, Serbia, Aleksandra Medvedeva 20, 18000 Nis, Serbia, College of Applied Technical Sciences, Nis, Serbia;

    Faculty of Electronic Engineering, University of Nis, Aleksandra Medvedeva 14, 18000 Nis, Serbia;

    Faculty of Electronic Engineering, University of Nis, Aleksandra Medvedeva 14, 18000 Nis, Serbia;

    Faculty of Electronic Engineering, University of Nis, Aleksandra Medvedeva 14, 18000 Nis, Serbia;

  • Indexed in: Science Citation Index (SCI); Engineering Index (EI)
  • Original format: PDF
  • Language: English
  • Keywords

    Instance selection; Mutual information; Time-series prediction;

