Journal: Analytica chimica acta

A consensus orthogonal partial least squares discriminant analysis (OPLS-DA) strategy for multiblock Omics data fusion



Abstract

Omics approaches have proven their value for the broad monitoring of biological systems. However, as no single analytical technique is sufficient to reveal the full biochemical content of complex biological matrices or biofluids, the fusion of information from several data sources has become a decisive issue. Omics studies generate increasingly massive amounts of data obtained from different analytical devices. These data are usually high-dimensional, and extracting knowledge from these multiple blocks is challenging. Appropriate tools are therefore needed to handle these datasets. For that purpose, a generic methodology is proposed that combines the strengths of established data analysis strategies, i.e. multiple kernel learning and OPLS-DA, to offer an efficient tool for the fusion of Omics data obtained from multiple sources. Three real case studies are presented to assess the potential of the method. A first example illustrates the fusion of mass spectrometry-based metabolomic data acquired in both negative and positive electrospray ionisation modes, from leaf samples of the model plant Arabidopsis thaliana. A second dataset involves the classification of wine grape varieties based on polyphenolic extracts analysed by two-dimensional heteronuclear magnetic resonance spectroscopy. A third case study underlines the ability of the method to combine heterogeneous data from systems biology, through the analysis of publicly available data related to NCI-60 cancer cell lines from different tissue origins, which include metabolomics, transcriptomics and proteomics. The fusion of Omics data from different sources is expected to provide a more complete view of biological systems. The proposed method was shown to be a relevant and widely applicable alternative for efficiently handling the inherent characteristics of multiple Omics data, such as very large numbers of noisy collinear variables.
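
The abstract does not spell out how the blocks are fused, so the following is only a minimal sketch of one common multiple-kernel idea: compute a linear kernel per data block, scale each kernel so that no block dominates, and sum them into a single consensus kernel that a kernel-based classifier could then use. The function name `consensus_kernel`, the Frobenius-norm scaling, the equal default weights and the random example blocks are all assumptions made for illustration, not details taken from the paper.

```python
import numpy as np

def consensus_kernel(blocks, weights=None):
    """Fuse data blocks (same samples in rows) into one kernel matrix.

    Hypothetical helper: linear kernels, Frobenius-norm scaling and
    equal default weights are illustrative assumptions.
    """
    n = blocks[0].shape[0]
    if weights is None:
        weights = [1.0 / len(blocks)] * len(blocks)
    fused = np.zeros((n, n))
    for X, w in zip(blocks, weights):
        Xc = X - X.mean(axis=0)                     # column-centre the block
        K = Xc @ Xc.T                               # linear kernel, shape (n, n)
        fused += w * K / np.linalg.norm(K, "fro")   # scale, then weight
    return fused

# Two made-up blocks measured on the same 6 samples, with very
# different numbers of variables and very different scales.
rng = np.random.default_rng(0)
block_a = rng.normal(size=(6, 50))
block_b = 100 * rng.normal(size=(6, 2000))

K = consensus_kernel([block_a, block_b])
print(K.shape)
```

Because every block is reduced to a samples-by-samples matrix before fusion, blocks with thousands of variables and blocks with a few dozen contribute on a comparable footing, which is the property that makes kernel-level fusion attractive for heterogeneous Omics data.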
