Journal: Analytica chimica acta

Reproducibility, complementary measure of predictability for robustness improvement of multivariate calibration models via variable selections


Abstract

In multivariate calibration with spectral datasets, variable selection is often applied to identify a relevant subset of variables, which improves prediction accuracy and makes the selected fingerprint regions easier to interpret. Numerous variable selection methods have been proposed, but choosing among them is not trivial. Furthermore, in many cases the set of variables found by these methods may not be robust because of irreproducibility and uncertainty, which poses a great challenge to improving the reliability of variable selection. In this study, the reproducibility of five variable selection methods was investigated quantitatively to evaluate their performance. Reproducibility was quantified using Monte-Carlo sub-sampling (MCS) together with a quantitative similarity measure designed for highly collinear spectral datasets. An investigation of the reproducibility and prediction accuracy of these variable selection algorithms on two different near-infrared (NIR) datasets showed that the methods varied widely in performance, especially in their ability to identify a consistent subset of variables from the spectral datasets. Thus, a thorough assessment of reproducibility together with the predictive accuracy of the identified variables improved the statistical validity and confidence of the selection outcome, which conventional evaluation schemes cannot address.
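
The idea of scoring a selection method by how consistently it picks the same variables across Monte-Carlo sub-samples can be illustrated with a minimal sketch. Everything named here is an assumption for illustration: the abstract does not specify the similarity measure, so the plain Jaccard index is swapped in (unlike the study's measure, it does not account for collinearity between neighbouring wavelengths), and `top_k_correlation` is a hypothetical toy selector, not one of the five methods studied.

```python
import itertools
import random
import statistics


def jaccard(a, b):
    """Jaccard index of two collections of selected variable indices."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0  # two empty selections are treated as identical
    return len(a & b) / len(a | b)


def top_k_correlation(X, y, k=5):
    """Toy selector (assumption): indices of the k columns of X with the
    largest absolute Pearson correlation against y."""
    n_vars = len(X[0])
    y_mean = statistics.fmean(y)
    yc = [v - y_mean for v in y]
    scores = []
    for j in range(n_vars):
        col = [row[j] for row in X]
        c_mean = statistics.fmean(col)
        cc = [v - c_mean for v in col]
        num = sum(p * q for p, q in zip(cc, yc))
        den = (sum(p * p for p in cc) * sum(q * q for q in yc)) ** 0.5
        scores.append(abs(num) / den if den > 0 else 0.0)
    return sorted(range(n_vars), key=scores.__getitem__, reverse=True)[:k]


def selection_reproducibility(X, y, select, n_runs=30, frac=0.8, seed=0):
    """Mean pairwise Jaccard similarity of the subsets that `select` returns
    on random sub-samples (drawn without replacement) of the rows of X.
    Requires n_runs >= 2 so that at least one pair exists."""
    rng = random.Random(seed)
    n = len(X)
    m = max(2, round(frac * n))
    subsets = []
    for _ in range(n_runs):
        idx = rng.sample(range(n), m)
        subsets.append(select([X[i] for i in idx], [y[i] for i in idx]))
    sims = [jaccard(a, b) for a, b in itertools.combinations(subsets, 2)]
    return statistics.fmean(sims)
```

Here `X` is a list of spectra (one list of floats per sample) and `y` the reference values. A score near 1 means the selector returns nearly the same variables on every sub-sample, and a score near 0 means the selections barely overlap. Such a score would be reported alongside, not instead of, a prediction-error estimate.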