Journal: Chemometrics and Intelligent Laboratory Systems

Revealing informative metabolites with random variable combination based on model population analysis for metabolomics data



Abstract

Biomarker discovery is a critical step in metabolomics research. As high-resolution instruments generate increasingly complex metabolomics data, chemometricians and statisticians need methods that efficiently reveal informative metabolites (variables). This study proposes a strategy, called revealing informative metabolites iteratively (RIMI), that is built on the framework of model population analysis and coupled with partial least squares discriminant analysis (PLS-DA). To account for the synergetic effect of multiple variables, a large population of random variable combinations is generated. Only the variable combinations with higher model accuracy are used to build paired models, so that the importance of each variable can be assessed statistically according to its beneficial contribution to classification model performance. Four types of variables (strongly informative, weakly informative, noise, and interfering) are then identified from the difference, and the significance of that difference, between the area under the receiver operating characteristic curve (AUROC) values obtained when each variable is included and when it is excluded. Under this definition, unbeneficial variables (noise and interfering variables) are eliminated iteratively in a mild way. Strongly and weakly informative variables, regarded as beneficial, are retained, and their t-test P values are used to reveal the best variable subset. Because it exploits the information in a vast number of well-performing variable combinations, RIMI greatly improved classification accuracy relative to other methods when applied to two metabolomics datasets. The results indicate that RIMI efficiently reveals informative metabolites and is a good alternative for biomarker discovery.
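The inclusion/exclusion comparison described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation. A simple centroid-difference classifier stands in for PLS-DA. A single stratified hold-out split replaces whatever validation scheme the paper uses. Combinations are ranked by AUROC rather than accuracy. The mapping of the four variable types onto the sign and significance of the AUROC difference is a guess from the abstract's wording. The iterative elimination and the final P-value-based subset selection are not shown. All function names, parameters (`n_combos`, `keep_frac`, `t_crit`) and the toy data are hypothetical.

```python
import random
import statistics


def auroc(scores, labels):
    """AUROC as P(score of a positive > score of a negative); ties count 0.5."""
    pos = [s for s, lab in zip(scores, labels) if lab == 1]
    neg = [s for s, lab in zip(scores, labels) if lab == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def subset_auroc(X, y, subset, train_idx, test_idx):
    """Hold-out AUROC of a stand-in classifier (NOT PLS-DA): samples are
    scored by projection onto the difference of the two class centroids."""
    cols = sorted(subset)

    def centroid(label):
        rows = [X[i] for i in train_idx if y[i] == label]
        return [sum(r[j] for r in rows) / len(rows) for j in cols]

    w = [a - b for a, b in zip(centroid(1), centroid(0))]
    scores = [sum(wj * X[i][j] for wj, j in zip(w, cols)) for i in test_idx]
    return auroc(scores, [y[i] for i in test_idx])


def classify_variables(X, y, n_combos=300, keep_frac=0.2, t_crit=2.0, seed=0):
    """One round of the inclusion/exclusion comparison; returns {variable: label}.
    Expects 0/1 class labels and at least two variables."""
    rng = random.Random(seed)
    n, p = len(X), len(X[0])

    # Stratified hold-out split (a simplification assumed for this sketch).
    train_idx, test_idx = [], []
    for label in (0, 1):
        members = [i for i in range(n) if y[i] == label]
        train_idx += members[0::2]
        test_idx += members[1::2]

    def score(subset):
        return subset_auroc(X, y, subset, train_idx, test_idx)

    # 1. Generate a population of random variable combinations.
    combos = [frozenset(rng.sample(range(p), rng.randint(2, max(2, p // 2))))
              for _ in range(n_combos)]

    # 2. Keep only the better-performing combinations.
    top = sorted(combos, key=score, reverse=True)[: max(2, int(keep_frac * n_combos))]

    # 3. Paired models: each kept combination with and without variable v.
    labels = {}
    for v in range(p):
        diffs = [score(s | {v}) - score(s - {v}) for s in top]
        mean = statistics.fmean(diffs)
        se = statistics.stdev(diffs) / len(diffs) ** 0.5
        significant = abs(mean) > t_crit * se  # rough paired t-test cut-off (assumed)
        if mean > 0:
            labels[v] = "strongly informative" if significant else "weakly informative"
        else:
            labels[v] = "interfering" if significant else "noise"
    return labels


# Toy data (hypothetical): variable 0 separates the classes, variables 1-9 are noise.
data_rng = random.Random(1)
y = [i % 2 for i in range(60)]
X = [[data_rng.gauss(3.0 * lab if j == 0 else 0.0, 1.0) for j in range(10)] for lab in y]
labels = classify_variables(X, y)
print(labels)
```

On the toy data, variable 0 should come out as beneficial, while the pure-noise variables can fall into any category by chance in a single round. That is one plausible reason for removing unbeneficial variables gradually over several iterations rather than all at once.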
