Source: Metabolites (NIH literature collection)

Comparison of Bi- and Tri-Linear PLS Models for Variable Selection in Metabolomic Time-Series Experiments



Abstract

Metabolomic studies with a time-series design are widely used for the discovery and validation of biomarkers. In such studies, changes in metabolic profiles over time are compared under different conditions (e.g., control and intervention), and metabolites that respond differently between the conditions are identified as putative biomarkers. To incorporate time-series information into variable (biomarker) selection in partial least squares regression (PLS) models, we created PLS models with different combinations of bilinear/trilinear X and group/time response dummy Y. In total, five PLS models were evaluated on two real datasets and on simulated datasets with varying characteristics (number of subjects, number of variables, inter-individual variability, intra-individual variability and number of time points). Variables showing specific temporal patterns, observed visually and determined statistically, were labelled as discriminating variables. Bootstrapped VIP scores were calculated for variable selection, and the variable selection performance of the five PLS models was assessed based on their capacity to correctly select the discriminating variables. The results showed that the bilinear PLS model with the group × time response as dummy Y provided the highest recall (true positive rate), 83–95%, with high precision, independent of most characteristics of the datasets. Trilinear PLS models tend to select a small number of variables with high precision but a relatively high false negative rate (lower power). They are also less affected by noise than bilinear PLS models. In datasets with high inter-individual variability, bilinear PLS models tend to provide higher recall, while trilinear models tend to provide higher precision. Overall, we recommend bilinear PLS with the group × time response Y for variable selection in metabolomics intervention time-series studies.
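
As an illustration of two ideas in the abstract, the sketch below builds a group × time dummy Y and scores a variable selection by precision and recall. It is only a minimal sketch under stated assumptions: the abstract does not specify the encoding, so one indicator column per group-time combination is assumed here, and the function names, sample labels and metabolite names are all hypothetical. The PLS fitting and bootstrapped VIP steps are not shown.

```python
def group_time_dummy(groups, times):
    """One-hot encode each sample's (group, time) combination.

    Assumption: the group x time response is coded with one indicator
    column per observed group-time combination.
    """
    levels = sorted(set(zip(groups, times)))
    column = {level: j for j, level in enumerate(levels)}
    Y = [[0] * len(levels) for _ in groups]
    for i, key in enumerate(zip(groups, times)):
        Y[i][column[key]] = 1
    return levels, Y


def precision_recall(selected, discriminating):
    """Precision and recall of the selected variables against the
    variables labelled as discriminating."""
    selected, discriminating = set(selected), set(discriminating)
    true_positives = len(selected & discriminating)
    precision = true_positives / len(selected) if selected else 0.0
    recall = true_positives / len(discriminating) if discriminating else 0.0
    return precision, recall


# Hypothetical design: two groups measured at two time points.
groups = ["control", "control", "intervention", "intervention"]
times = [0, 1, 0, 1]
levels, Y = group_time_dummy(groups, times)

# Hypothetical selection result: m1..m3 selected, m1, m2, m4, m5 discriminating.
precision, recall = precision_recall(["m1", "m2", "m3"], ["m1", "m2", "m4", "m5"])
```

In this toy design every sample falls in its own group-time cell, so Y is a 4 × 4 identity matrix; with several subjects per cell, rows sharing a cell would share the same indicator column.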
