PLoS Computational Biology

Accurate cancer phenotype prediction with AKLIMATE, a stacked kernel learner integrating multimodal genomic data and pathway knowledge



Abstract

Advancements in sequencing have led to the proliferation of multi-omic profiles of human cells under different conditions and perturbations. In addition, many databases have amassed information about pathways and gene “signatures” (patterns of gene expression associated with specific cellular and phenotypic contexts). An important current challenge in systems biology is to leverage such knowledge about gene coordination to maximize the predictive power and generalization of models applied to high-throughput datasets. However, few such integrative approaches also provide interpretable results that quantify the importance of individual genes and pathways to model accuracy. We introduce AKLIMATE, a first kernel-based stacked learner that seamlessly integrates multi-omics feature data with prior information in the form of pathways, for either regression or classification tasks. AKLIMATE uses a novel multiple-kernel learning framework in which individual kernels capture the prediction propensities recorded in random forests, each built from a specific pathway gene set that integrates all omics data for its member genes. AKLIMATE has comparable or improved performance relative to state-of-the-art methods on diverse phenotype learning tasks, including predicting microsatellite instability in endometrial and colorectal cancer, survival in breast cancer, and cell line response to gene knockdowns. We show how AKLIMATE connects feature data across data platforms through their common pathways to identify several known and novel contributors to cancer and synthetic lethality.
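
The abstract describes kernels derived from pathway-specific random forests and combined through multiple-kernel learning. The sketch below illustrates one common way to turn forest output into a kernel, namely the fraction of trees in which two samples fall into the same leaf, followed by a weighted sum of per-pathway kernels. It is not AKLIMATE's implementation: the function names, the toy leaf assignments, the pathway names, and the hand-picked weights are all assumptions made for illustration, and the fixed weights merely stand in for weights that a multiple-kernel learner would fit.

```python
import numpy as np


def leaf_cooccurrence_kernel(leaves):
    """Build a sample-by-sample kernel from forest leaf assignments.

    leaves: integer array of shape (n_samples, n_trees); entry [i, t] is the
    leaf that sample i reaches in tree t.
    Returns K with K[i, j] = fraction of trees in which i and j share a leaf.
    """
    leaves = np.asarray(leaves)
    # Compare every pair of samples tree by tree, then average over trees.
    same_leaf = leaves[:, None, :] == leaves[None, :, :]
    return same_leaf.mean(axis=2)


def combine_kernels(kernels, weights):
    """Weighted sum of kernels with non-negative weights normalized to 1."""
    weights = np.asarray(weights, dtype=float)
    if weights.ndim != 1 or len(weights) != len(kernels):
        raise ValueError("need exactly one weight per kernel")
    if np.any(weights < 0) or weights.sum() == 0:
        raise ValueError("weights must be non-negative and not all zero")
    weights = weights / weights.sum()
    return sum(w * K for w, K in zip(weights, kernels))


# Hypothetical leaf assignments for 4 samples in two pathway-specific forests
# of 3 trees each. In practice these would come from fitted forests, each
# trained only on the features of one pathway's member genes.
pathway_leaves = {
    "pathway_A": np.array([[1, 2, 1], [1, 2, 3], [4, 5, 3], [4, 5, 6]]),
    "pathway_B": np.array([[0, 0, 1], [2, 0, 1], [2, 3, 1], [2, 3, 4]]),
}

kernels = {name: leaf_cooccurrence_kernel(lv) for name, lv in pathway_leaves.items()}

# Assumed weights; a multiple-kernel learner would estimate these from data.
K = combine_kernels(list(kernels.values()), weights=[0.7, 0.3])
print(np.round(K, 3))
```

In a setup like this, the size of each fitted weight would give a rough indication of how much a pathway contributes to the combined kernel, which is the kind of pathway-level interpretability the abstract refers to. The combined matrix could then be handed to any kernel method that accepts a precomputed Gram matrix.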
