Journal of Chemometrics

EVOLUTIONARY VARIABLE SELECTION IN REGRESSION AND PLS ANALYSES

Abstract

Evolutionary and genetic algorithms are powerful tools for searching global optima of complex functions. An evolutionary approach, the MUSEUM (mutation and selection uncover models) programme, is applied to various QSAR data sets to prove the general applicability of this approach for variable selection in regression and PLS analyses. 'Best' regression models are found within seconds or a few minutes of calculation time, even for data sets including large numbers of variables. The MUSEUM algorithm starts from an arbitrary model and adds or eliminates variables to or from this model in a random manner. Any 'better' model defined by a certain fitness criterion is taken as a new breeding organism which is mutated by further variable additions, eliminations or exchanges. In this manner the models improve gradually until a global optimum or at least a good local optimum results. In most cases several different models are obtained from different runs. A systematic search for the best models indicates that in all cases the global optima and good local optima result from the evolutionary search. Most often the fit and cross-validation results of these regression models are better than the fit and cross-validation results of a PLS analysis which includes all variables of the data set. The variables contained in the best regression models are suitable as subsets for PLS analyses and some of these PLS results are even better than the best regression results.
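
The mutation-and-selection loop described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the MUSEUM programme itself: the abstract does not specify the fitness criterion, so the leave-one-out cross-validated Q² of an ordinary least-squares model is assumed here, and the function names (`loo_q2`, `mutate`, `evolve`), the single-variable starting model, and the iteration count are all hypothetical choices.

```python
import numpy as np


def loo_q2(X, y, subset):
    """Assumed fitness: leave-one-out Q^2 of an OLS model on the chosen columns."""
    if not subset:
        return -np.inf
    A = np.column_stack([np.ones(len(y)), X[:, sorted(subset)]])
    if A.shape[1] >= len(y):
        return -np.inf  # too many parameters for the number of samples
    H = A @ np.linalg.pinv(A)          # hat matrix of the OLS fit
    leverage = np.diag(H)
    if np.any(1.0 - leverage < 1e-10):
        return -np.inf
    # For OLS, the leave-one-out residual is e_i / (1 - h_ii).
    press = np.sum(((y - H @ y) / (1.0 - leverage)) ** 2)
    return 1.0 - press / np.sum((y - y.mean()) ** 2)


def mutate(subset, n_vars, rng):
    """Randomly add, eliminate, or exchange one variable."""
    child = set(subset)
    outside = [j for j in range(n_vars) if j not in child]
    move = str(rng.choice(["add", "drop", "swap"]))
    if move in ("drop", "swap") and len(child) > 1:
        child.remove(int(rng.choice(sorted(child))))
    if move in ("add", "swap") and outside:
        child.add(int(rng.choice(outside)))
    return child


def evolve(X, y, n_iter=2000, seed=0):
    """Keep any mutated model that beats the current one on the fitness criterion."""
    rng = np.random.default_rng(seed)
    n_vars = X.shape[1]
    parent = {int(rng.integers(n_vars))}   # arbitrary starting model
    best = loo_q2(X, y, parent)
    for _ in range(n_iter):
        child = mutate(parent, n_vars, rng)
        score = loo_q2(X, y, child)
        if score > best:                   # the 'better' model becomes the new parent
            parent, best = child, score
    return sorted(parent), best
```

Calling `evolve` with several different `seed` values mirrors the abstract's remark that different runs can end in different models; comparing the returned subsets and scores gives a rough sense of how many good local optima a data set has.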
