Journal: Cancer Informatics

Adaptive Multiview Nonnegative Matrix Factorization Algorithm for Integration of Multimodal Biomedical Data

Abstract

The amounts and types of available multimodal tumor data are rapidly increasing, and their integration is critical for fully understanding the underlying cancer biology and personalizing treatment. However, the development of methods for integrating multimodal data effectively and in a principled manner lags behind our ability to generate the data. In this article, we introduce an extension to a multiview nonnegative matrix factorization (NNMF) algorithm for dimensionality reduction and integration of heterogeneous data types, and we compare the predictive modeling performance of the method on unimodal and multimodal data. We also present a comparative evaluation of our novel multiview approach and current data integration methods. Our work provides an efficient way to extend an existing dimensionality reduction method. We report a rigorous evaluation of the method on large-scale quantitative protein and phosphoprotein tumor data from the Clinical Proteomic Tumor Analysis Consortium (CPTAC), acquired using state-of-the-art liquid chromatography mass spectrometry. Exome sequencing and RNA-Seq data were also available from The Cancer Genome Atlas for the same tumors. For unimodal data in breast cancer, transcript levels were most predictive of estrogen and progesterone receptor status, and copy number variation was most predictive of human epidermal growth factor receptor 2 status. For ovarian and colon cancers, phosphoprotein and protein levels were most predictive of tumor grade and stage and residual tumor, respectively. When multiview NNMF was applied to multimodal data to predict outcomes, the improvement in performance over unimodal data was not statistically significant overall, suggesting that proteomics data may contain more predictive information regarding tumor phenotypes than transcript levels, probably because proteins are the functional gene products and are therefore a more direct measurement of the functional state of the tumor. Here, we have applied our proposed approach to multimodal molecular data for tumors, but it is generally applicable to dimensionality reduction and joint analysis of any type of multimodal data.
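
The abstract does not spell out the factorization itself, so the following is only a minimal sketch of the general idea behind multiview NNMF, not the authors' adaptive extension. It assumes a common shared-factor formulation in which each view X_v (samples by features) is approximated as W H_v, with one nonnegative sample factor W shared across views, fitted by standard multiplicative updates for squared error. The names `multiview_nmf` and `reconstruction_error`, the rank, and the iteration count are hypothetical choices made for illustration.

```python
import numpy as np

def multiview_nmf(views, rank, n_iter=200, eps=1e-9, seed=0):
    """Approximate each view X_v (samples x features_v) as W @ H_v.

    Assumption: a single nonnegative factor W is shared by all views;
    this is a generic formulation, not the method from the article.
    """
    rng = np.random.default_rng(seed)
    n_samples = views[0].shape[0]
    W = rng.random((n_samples, rank)) + eps
    Hs = [rng.random((rank, X.shape[1])) + eps for X in views]
    for _ in range(n_iter):
        # Update each view-specific factor H_v (multiplicative rule).
        for v, X in enumerate(views):
            Hs[v] *= (W.T @ X) / (W.T @ W @ Hs[v] + eps)
        # Update the shared factor W using evidence pooled over all views.
        numer = sum(X @ H.T for X, H in zip(views, Hs))
        denom = sum(W @ (H @ H.T) for H in Hs) + eps
        W *= numer / denom
    return W, Hs

def reconstruction_error(views, W, Hs):
    """Sum of squared Frobenius errors over all views."""
    return sum(np.linalg.norm(X - W @ H) ** 2 for X, H in zip(views, Hs))

# Synthetic example: two nonnegative views that share a rank-3 sample factor.
rng = np.random.default_rng(1)
W_true = rng.random((20, 3))
views = [W_true @ rng.random((3, 15)), W_true @ rng.random((3, 8))]
W, Hs = multiview_nmf(views, rank=3)
```

In this sketch the rows of `W` act as a low-dimensional joint representation of each sample, which could then be passed to any downstream predictive model in place of the raw unimodal features.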
