Journal of Global Optimization

Subset selection for multiple linear regression via optimization


Abstract

Subset selection in multiple linear regression aims to choose a subset of candidate explanatory variables that trades off fitting error (explanatory power) against model complexity (number of variables selected). We build mathematical programming models for regression subset selection based on mean square and absolute errors, and minimal-redundancy-maximal-relevance criteria. The proposed models are tested using a linear-program-based branch-and-bound algorithm with tailored valid inequalities and big-M values, and are compared against algorithms from the literature. For high-dimensional cases, an iterative heuristic algorithm is proposed based on the mathematical programming models and a core set concept, and a randomized version of the algorithm is derived to guarantee convergence to the global optimum. From the computational experiments, we find that our models quickly find a quality solution, while the rest of the time is spent proving optimality; the iterative algorithms find solutions in a relatively short time and are competitive with state-of-the-art algorithms; and using ad hoc big-M values is not recommended.
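
To make the role of the big-M values concrete, the following is a generic textbook-style sketch of a mixed-integer model for subset selection under absolute error. It is not taken from the paper, whose exact models are not given in this abstract. All symbols are assumptions introduced for illustration: n observations (x_i, y_i) with m candidate variables, coefficients \beta_j, intercept \beta_0, binary selectors z_j, residual bounds t_i, a cardinality limit k, and a constant M.

```latex
\begin{align*}
\min_{\beta_0,\,\beta,\,t,\,z} \quad & \sum_{i=1}^{n} t_i \\
\text{s.t.} \quad
 & -t_i \;\le\; y_i - \beta_0 - \sum_{j=1}^{m} x_{ij}\,\beta_j \;\le\; t_i,
   && i = 1,\dots,n, \\
 & -M z_j \;\le\; \beta_j \;\le\; M z_j,
   && j = 1,\dots,m, \\
 & \sum_{j=1}^{m} z_j \;\le\; k, \\
 & z_j \in \{0,1\},
   && j = 1,\dots,m.
\end{align*}
```

In this sketch, z_j = 0 forces \beta_j = 0, so the constraint on the sum of the z_j caps the number of selected variables. Because the objective and all constraints are linear, the model can be solved by an LP-based branch-and-bound. The choice of M matters: if M is smaller than the magnitude of an optimal coefficient, the true optimum is cut off, while an unnecessarily large M weakens the LP relaxation and slows the search. This is one plausible reading of why the abstract advises against ad hoc big-M values.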
