Journal (as listed): Nature reviews neuroscience
Comparison and evaluation of integrative methods for the analysis of multilevel omics data: a study based on simulated and experimental cancer data

Abstract

Integrative analysis aims to identify the driving factors of a biological process by jointly exploring data from multiple cellular levels. The volume of omics data produced is constantly increasing, and so is the collection of tools for its analysis. Comparative studies that assess performance and the biological value of results, however, are rare despite being in great demand. We present a comprehensive comparison of three integrative analysis approaches, sparse canonical correlation analysis (sCCA), non-negative matrix factorization (NMF) and the logic data mining tool MicroArray Logic Analyzer (MALA), by applying them to simulated and experimental omics data. We find that sCCA and NMF are able to identify differential features in simulated data, while the logic data mining method, MALA, falls short. Applied to experimental data, MALA performs best in terms of sample classification accuracy, and in general the classification power of the prioritized feature sets is high (97.1-99.5% accuracy). The proportion of a method's features that is also identified by at least one of the other methods, however, is approximately 60% for sCCA and NMF and nearly 30% for MALA, and the proportion of features jointly identified by all methods is only around 16%. Similarly, the congruence on functional levels (Gene Ontology, Reactome) is low. Furthermore, the agreement of the identified feature sets with curated gene signatures relevant to the investigated disease is modest. We discuss possible reasons for the moderate overlap of the identified feature sets with each other and with curated cancer signatures. The R code to create simulated data, results and figures is provided at https://github.com/ThallingerLab/IamComparison.
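The overlap figures quoted in the abstract can be made concrete with a small sketch. The following Python snippet is not taken from the authors' R code; it is an illustration under stated assumptions. The function name `overlap_summary`, the toy feature names `g1` to `g6`, and the choice of the union of all feature sets as the denominator for the "identified by all methods" proportion are all hypothetical, since the abstract does not define them.

```python
from typing import Dict, Set


def overlap_summary(feature_sets: Dict[str, Set[str]]) -> Dict[str, float]:
    """Summarise the overlap between per-method feature sets.

    For each method, the value is the fraction of its features that at least
    one other method also selected. The extra key "all" holds the fraction of
    the union of all features that every method selected (assumed denominator).
    """
    if not feature_sets:
        return {}
    summary: Dict[str, float] = {}
    for method, features in feature_sets.items():
        # Union of the features selected by every other method.
        others: Set[str] = set().union(
            *(s for m, s in feature_sets.items() if m != method)
        )
        summary[method] = len(features & others) / len(features) if features else 0.0
    union = set().union(*feature_sets.values())
    common = set.intersection(*feature_sets.values())
    summary["all"] = len(common) / len(union) if union else 0.0
    return summary


# Toy input with made-up feature names, only to show the call shape.
toy = {
    "sCCA": {"g1", "g2", "g3", "g4"},
    "NMF": {"g2", "g3", "g5"},
    "MALA": {"g3", "g6"},
}
result = overlap_summary(toy)
```

With real data, each set would hold the features prioritized by one method. A low per-method value then indicates that a method selects mostly features the others miss, which is the pattern the abstract reports for MALA.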