Journal: Spectrochimica Acta, Part B. Atomic Spectroscopy

Comparison of partial least squares and lasso regression techniques as applied to laser-induced breakdown spectroscopy of geological samples

Abstract

A remote laser-induced breakdown spectrometer (LIBS) designed to simulate the ChemCam instrument on the Mars Science Laboratory Rover Curiosity was used to probe 100 geologic samples at a 9-m standoff distance. ChemCam consists of an integrated remote LIBS instrument that will probe samples up to 7 m from the mast of the rover and a remote micro-imager (RMI) that will record context images. The elemental compositions of 100 igneous and highly-metamorphosed rocks are determined with LIBS using three variations of multivariate analysis, with a goal of improving the analytical accuracy. Two forms of partial least squares (PLS) regression are employed with finely-tuned parameters: PLS-1 regresses a single response variable (elemental concentration) against the observation variables (spectra, or intensity at each of 6144 spectrometer channels), while PLS-2 simultaneously regresses multiple response variables (concentrations of the ten major elements in rocks) against the observation predictor variables, taking advantage of natural correlations between elements. Those results are contrasted with those from the multivariate regression technique of the least absolute shrinkage and selection operator (lasso), which is a penalized shrunken regression method that selects the specific channels for each element that explain the most variance in the concentration of that element. To make this comparison, we use results of cross-validation and of held-out testing, and employ unscaled and uncentered spectral intensity data because all of the input variables are already in the same units. Results demonstrate that the lasso, PLS-1, and PLS-2 all yield comparable results in terms of accuracy for this dataset. However, the interpretability of these methods differs greatly in terms of fundamental understanding of LIBS emissions.
PLS techniques generate principal components, linear combinations of intensities at any number of spectrometer channels, which explain as much variance in the response variables as possible while avoiding multicollinearity between principal components. When the selected number of principal components is projected back into the original feature space of the spectra, 6144 correlation coefficients are generated, a small fraction of which are mathematically significant to the regression. In contrast, the lasso models require only a small number (<24) of non-zero correlation coefficients (β values) to determine the concentration of each of the ten major elements. Causality between the positively-correlated emission lines chosen by the lasso and the elemental concentration was examined. In general, the higher the lasso coefficient (β), the greater the likelihood that the selected line results from an emission of that element. Emission lines with negative β values should arise from elements that are anti-correlated with the element being predicted. For elements except Fe, Al, Ti, and P, the lasso-selected wavelength with the highest β value corresponds to the element being predicted, e.g., 559.8 nm for neutral Ca. However, the specific lines chosen by the lasso with positive β values are not always those from the element being predicted. Other wavelengths and the elements that most strongly correlate with them to predict concentration are obviously related to known geochemical correlations or close overlap of emission lines, while others must result from matrix effects. Use of the lasso technique thus directly informs our understanding of the underlying physical processes that give rise to LIBS emissions by determining which lines can best represent concentration, and which lines from other elements are causing matrix effects.
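
The channel-selecting behaviour attributed to the lasso above can be illustrated with a minimal sketch. It is not the authors' implementation: the data are synthetic, the solver is a plain cyclic coordinate descent written for illustration, and the names `fit_lasso` and `soft_threshold`, the channel indices 3 and 17, and the penalty `alpha = 0.02` are all hypothetical. Unlike the study, which worked with uncentered spectra, the sketch centres (but does not scale) the channels so that an intercept can be fitted.

```python
import random


def soft_threshold(value, threshold):
    """Shrink value toward zero; values inside [-threshold, threshold] become 0."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def fit_lasso(X, y, alpha, n_sweeps=50):
    """Minimise (1/(2n)) * ||y - b0 - X b||^2 + alpha * ||b||_1 by coordinate descent.

    X: list of n spectra, each a list of p channel intensities. y: n concentrations.
    Columns are centred (an assumption of this sketch) but not scaled.
    """
    n, p = len(X), len(X[0])
    col_mean = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    Xc = [[row[j] - col_mean[j] for j in range(p)] for row in X]
    residual = [yi - y_mean for yi in y]  # centred y while all coefficients are zero
    col_sq = [sum(row[j] ** 2 for row in Xc) / n for j in range(p)]
    beta = [0.0] * p
    for _ in range(n_sweeps):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue  # a constant channel carries no information
            # Covariance of channel j with the residual, excluding channel j's own term.
            rho = sum(Xc[i][j] * residual[i] for i in range(n)) / n + col_sq[j] * beta[j]
            updated = soft_threshold(rho, alpha) / col_sq[j]
            step = updated - beta[j]
            if step != 0.0:
                for i in range(n):
                    residual[i] -= Xc[i][j] * step
                beta[j] = updated
    intercept = y_mean - sum(b * m for b, m in zip(beta, col_mean))
    return beta, intercept


# Synthetic "spectra": 100 samples, 30 channels (hypothetical sizes, far below 6144).
rng = random.Random(0)
n_samples, n_channels = 100, 30
X = [[rng.uniform(0.0, 1.0) for _ in range(n_channels)] for _ in range(n_samples)]
# Hypothetical ground truth: channel 3 is a line of the predicted element,
# channel 17 is a line of an anti-correlated element; the rest are irrelevant.
y = [2.0 * row[3] - 1.0 * row[17] + rng.gauss(0.0, 0.01) for row in X]

beta, intercept = fit_lasso(X, y, alpha=0.02)
selected = [j for j, b in enumerate(beta) if b != 0.0]
for j in selected:
    print(f"channel {j}: beta = {beta[j]:+.3f}")
```

On data of this kind the penalty drives most coefficients to exactly zero, leaving a positive β on the channel tied to the predicted element and a negative β on the anti-correlated one. A PLS model, as described above, would instead return a coefficient for every channel.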
