Conference: IEEE Symposium Series on Computational Intelligence

A New Random Forest Method for Longitudinal Data Classification Using a Lexicographic Bi-Objective Approach



Abstract

Standard supervised machine learning methods often ignore the temporal information in longitudinal data, even though that information can lead to more precise predictions in classification tasks. Data preprocessing techniques and classification algorithms can be adapted to cope directly with longitudinal inputs by using temporal information such as the time-index of each feature and previous measurements of the class variable. In this article, we propose two changes to the classification task of predicting age-related diseases in a real-world dataset created from the English Longitudinal Study of Ageing. First, we explore adding previous measurements of the class variable as features and estimating the missing values in those added features using intermediate classifiers. Second, we propose a new split-feature selection procedure for a random forest’s decision trees, which considers the candidate features’ time-indexes in addition to the information gain ratio. Our experiments compared the proposed approaches to baseline approaches in three prediction scenarios that vary the “time gap” for the prediction, that is, how many years in advance the class (occurrence of an age-related disease) is predicted. The experiments were performed on 10 datasets that differ in the class variable, and showed that the proposed approaches increased the random forest’s predictive accuracy.
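
The abstract does not spell out how the split-feature selection combines the two criteria, so the following is only a minimal sketch of one plausible lexicographic rule, not the authors' procedure. It assumes that the information gain ratio is the primary objective, that candidates within a small `tolerance` of the best gain ratio count as tied, and that ties are resolved in favour of the most recent time-index. The `Candidate` class, the `tolerance` parameter, the feature names, and the numeric values are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    name: str          # hypothetical feature name, e.g. "bmi_w6"
    gain_ratio: float  # information gain ratio of the best split on this feature
    time_index: int    # wave of measurement (assumed: larger = more recent)


def select_split_feature(candidates, tolerance=0.01):
    """Pick a split feature lexicographically: gain ratio first, time-index second.

    Assumption (not stated in the abstract): candidates whose gain ratio is
    within `tolerance` of the best are treated as tied on the primary
    objective, and the tie goes to the most recent measurement.
    """
    if not candidates:
        raise ValueError("no candidate features")
    best = max(c.gain_ratio for c in candidates)
    near_best = [c for c in candidates if best - c.gain_ratio <= tolerance]
    # Secondary objective: latest time-index; gain ratio breaks remaining ties.
    return max(near_best, key=lambda c: (c.time_index, c.gain_ratio))


# Illustrative values only; they are not taken from the paper.
candidates = [
    Candidate("bmi_w2", 0.305, 2),
    Candidate("bmi_w6", 0.300, 6),
    Candidate("age_w6", 0.120, 6),
]
chosen = select_split_feature(candidates)
print(chosen.name)  # prints "bmi_w6"
```

With `tolerance=0.0` the rule falls back to plain gain-ratio selection and picks `bmi_w2`, which shows that the time-index only matters when the primary objective cannot separate the candidates.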
