Knowledge-Based Systems

Early stopping aggregation in selective variable selection ensembles for high-dimensional linear regression models


Abstract

Variable selection has become the most popular and effective tool for analyzing high-dimensional data. Among existing approaches, variable selection ensembles (VSEs) have shown great power in improving selection accuracy and stabilizing the results of a traditional selection method. Constructing a VSE generally consists of two phases: ensemble generation and ensemble aggregation. In this paper we study selective VSEs, which insert a pruning step before the generated members are combined into a VSE. As a result, a smaller but more accurate subensemble can be obtained. Taking ST2E (stochastic stepwise ensemble) as our main example, we first extend it to handle high-dimensional data. The aggregation order of its individual members is then rearranged according to their RIC_c (corrected risk inflation criterion) values, and only the top-ranked members are averaged to estimate the importance measure of each candidate variable. In terms of several variable ranking and selection metrics, experiments on simulated and real-world high-dimensional data show that pruned ST2E is superior to several other benchmark methods in most cases. An analysis of the accuracy-diversity patterns of VSEs shows that the pruning step excludes less accurate members and leads the retained members to concentrate more closely on the true importance vector. © 2018 Elsevier B.V. All rights reserved.
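
The ordering-and-averaging step described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `pruned_importance`, the parameter `keep`, and the toy data are hypothetical, and the criterion values are passed in as precomputed numbers (standing in for the corrected risk inflation criterion), with lower values assumed to be better.

```python
import numpy as np


def pruned_importance(selections, scores, keep):
    """Average the 0/1 selection vectors of the `keep` best-scoring members.

    selections: (B, p) array, row b marks the variables chosen by member b.
    scores:     (B,) array of criterion values, lower assumed better
                (a stand-in for the criterion named in the abstract).
    keep:       hypothetical name for the number of members retained.
    """
    selections = np.asarray(selections, dtype=float)
    scores = np.asarray(scores, dtype=float)
    if selections.ndim != 2 or selections.shape[0] != scores.shape[0]:
        raise ValueError("need one score per ensemble member")
    if not 1 <= keep <= scores.shape[0]:
        raise ValueError("keep must be between 1 and the ensemble size")

    # Rearrange the aggregation order: best (lowest) score first.
    order = np.argsort(scores, kind="stable")
    # Early stopping: average only the members ranked ahead.
    return selections[order[:keep]].mean(axis=0)


# Toy ensemble of 4 members over 4 candidate variables (made-up values).
selections = [
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [1, 1, 1, 0],
    [0, 0, 1, 1],
]
scores = [10.0, 12.0, 11.0, 30.0]

importance = pruned_importance(selections, scores, keep=3)  # pruned ensemble
full = pruned_importance(selections, scores, keep=4)        # no pruning
print(importance)
print(full)
```

In this toy case the worst-scoring member is the only one that selects the fourth variable, so dropping it moves that variable's importance from 0.25 to 0. How many members to retain is a tuning choice that the sketch leaves to the caller.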