BMC Bioinformatics

MetICA: independent component analysis for high-resolution mass-spectrometry based non-targeted metabolomics



Abstract

Background: Interpreting non-targeted metabolomics data remains a challenging task. Signals from non-targeted metabolomics studies stem from a combination of biological causes, complex interactions between them, and experimental bias/noise. The resulting data matrix usually contains a huge number of variables and only a few samples, so classical techniques based on nonlinear mapping can be computationally expensive and prone to overfitting. Independent Component Analysis (ICA), as a linear method, could potentially yield more meaningful results than Principal Component Analysis (PCA). However, a major problem with most ICA algorithms is that the output varies between runs, so the result of a single ICA run should be interpreted with caution.

Results: ICA was applied to simulated and experimental mass spectrometry (MS)-based non-targeted metabolomics data, under the hypothesis that the underlying sources are mutually independent. Inspired by the Icasso algorithm, a new ICA method, MetICA, was developed to handle the instability of ICA on complex datasets. Like the original Icasso algorithm, MetICA evaluates the algorithmic and statistical reliability of ICA runs. In addition, MetICA suggests two ways to select the optimal number of model components and gives an order of interpretation for the components obtained.

Conclusions: Correlating the components obtained with prior biological knowledge helps explain how non-targeted metabolomics data reflect biological nature and technical phenomena. Mass signals related to this information can also be extracted. This novel approach provides meaningful components due to their independent nature. Furthermore, it provides an innovative concept on which to base model selection: optimizing the number of reliable components instead of trying to fit the data. The current version of MetICA is available at https://github.com/daniellyz/MetICA.
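The run-to-run instability mentioned in the abstract can be illustrated with a small sketch. It is not MetICA or Icasso: it uses a compact NumPy FastICA (tanh nonlinearity, symmetric decorrelation) on a synthetic toy matrix, and it replaces Icasso's clustering of estimates with a simpler best-match correlation score against the first run. The function names (`whiten`, `fastica`, `stability_scores`), the matrix orientation (20 mixed rows by 500 observations), and all data are assumptions made for illustration.

```python
import numpy as np

def whiten(X, k):
    """Center X (rows = mixed channels, columns = observations) and
    project it onto its first k principal directions with unit variance."""
    Xc = X - X.mean(axis=1, keepdims=True)
    U, s, _ = np.linalg.svd(Xc, full_matrices=False)
    n = Xc.shape[1]
    K = (U[:, :k] / s[:k]).T * np.sqrt(n)   # k x rows whitening matrix
    return K @ Xc                           # k x n, covariance = identity

def sym_decorrelate(W):
    """Symmetric decorrelation: W <- (W W^T)^(-1/2) W."""
    vals, vecs = np.linalg.eigh(W @ W.T)
    return vecs @ np.diag(1.0 / np.sqrt(vals)) @ vecs.T @ W

def fastica(Z, rng, max_iter=200, tol=1e-6):
    """One FastICA run on whitened data Z from a random start."""
    k, n = Z.shape
    W = sym_decorrelate(rng.standard_normal((k, k)))
    for _ in range(max_iter):
        G = np.tanh(W @ Z)
        G_prime = 1.0 - G ** 2
        # Fixed-point update: E[z g(w^T z)] - E[g'(w^T z)] w, for all rows
        W_new = (G @ Z.T) / n - np.diag(G_prime.mean(axis=1)) @ W
        W_new = sym_decorrelate(W_new)
        delta = np.max(np.abs(np.abs(np.diag(W_new @ W.T)) - 1.0))
        W = W_new
        if delta < tol:
            break
    return W @ Z                            # k x n estimated components

def stability_scores(runs):
    """For each component of the first run, average its best absolute
    correlation with the components of every other run (sign and order
    of ICA components are arbitrary, hence abs and max)."""
    ref = runs[0]
    k = ref.shape[0]
    scores = np.zeros(k)
    for other in runs[1:]:
        C = np.abs(np.corrcoef(ref, other)[:k, k:])
        scores += C.max(axis=1)
    return scores / (len(runs) - 1)

# Synthetic toy data (hypothetical): 3 non-Gaussian sources, 20 mixed rows
rng = np.random.default_rng(0)
S = rng.laplace(size=(3, 500))
A = rng.standard_normal((20, 3))
X = A @ S + 0.05 * rng.standard_normal((20, 500))

Z = whiten(X, 3)
runs = [fastica(Z, np.random.default_rng(seed)) for seed in range(10)]
scores = stability_scores(runs)
print(np.round(scores, 3))
```

A score near 1 means the component is recovered in almost every run; components with low scores are the ones a single run should not be trusted on. Counting how many components stay stable as the requested number of components grows is one way to read the abstract's idea of selecting the model by the number of reliable components rather than by data fit.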
