Chemometrics and Intelligent Laboratory Systems

Distribution based truncation for variable selection in subspace methods for multivariate regression


Abstract

Analysing data that contain a vast number of features, of which only a few are informative, requires methods that can separate true signal from noise variables. One class of methods that attempts this is sparse partial least squares regression (sparse PLS). This paper aims to improve the theoretical foundation, speed and robustness of such methods. Distribution theory and the central limit theorem provide a general justification for truncating PLS loading weights. We also introduce a fast plug-in truncation procedure built on a novel application of theory developed for the analysis of variance of experiments without replicates. The result is a versatile and intuitive method that performs component-wise variable selection very efficiently and in a less ad hoc manner than existing methods. Prediction performance is on par with existing methods, while robustness is ensured through a better theoretical foundation.
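
The idea of truncating PLS loading weights can be illustrated with a minimal sketch. The abstract does not spell out the procedure, so everything below is an assumption made for illustration: the first PLS1 loading weight vector is taken as the normalised covariance between each centred variable and the response, and the noise scale is estimated with Lenth's pseudo standard error, a well-known tool for unreplicated factorial experiments. Whether the paper uses exactly this estimator, and the cut-off `multiplier=3.0`, are not confirmed by the source. The function names and the simulated data are hypothetical.

```python
import math
import random
from statistics import median


def first_loading_weights(X, y):
    """Unit-norm PLS1 loading weights, proportional to X^T y on centred data."""
    n, p = len(X), len(X[0])
    x_means = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    w = [
        sum((X[i][j] - x_means[j]) * (y[i] - y_mean) for i in range(n))
        for j in range(p)
    ]
    norm = math.sqrt(sum(v * v for v in w))
    return [v / norm for v in w]


def lenth_pse(values):
    """Lenth's pseudo standard error: a robust scale estimate for sparse effects."""
    abs_vals = [abs(v) for v in values]
    s0 = 1.5 * median(abs_vals)
    trimmed = [a for a in abs_vals if a < 2.5 * s0]
    # Guard: if more than half the values are exactly zero, nothing survives trimming.
    return 1.5 * median(trimmed) if trimmed else 0.0


def truncate_weights(w, multiplier=3.0):
    """Zero out weights within multiplier * PSE of zero, then renormalise.

    The multiplier is an illustrative choice, not a value from the paper.
    """
    cutoff = multiplier * lenth_pse(w)
    kept = [v if abs(v) > cutoff else 0.0 for v in w]
    norm = math.sqrt(sum(v * v for v in kept))
    return [v / norm for v in kept] if norm > 0 else kept


# Simulated data (hypothetical): 500 variables, only the first 5 carry signal.
rng = random.Random(0)
n, p, n_informative = 300, 500, 5
X = [[rng.gauss(0.0, 1.0) for _ in range(p)] for _ in range(n)]
y = [3.0 * sum(row[:n_informative]) + rng.gauss(0.0, 1.0) for row in X]

w_dense = first_loading_weights(X, y)
w_sparse = truncate_weights(w_dense)
selected = [j for j, v in enumerate(w_sparse) if v != 0.0]
print(f"{len(selected)} of {p} variables kept; first few: {selected[:10]}")
```

In this toy setting the noise weights scatter around zero while the informative ones stand well clear of them, which is what makes a distribution-based cut-off usable. A full sparse PLS fit would repeat the truncation for each component after deflation, a step omitted here.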