PLoS Computational Biology
Heterogeneous Network Edge Prediction: A Data Integration Approach to Prioritize Disease-Associated Genes

Abstract

The first decade of Genome Wide Association Studies (GWAS) has uncovered a wealth of disease-associated variants. Two important derivations will be translating this information into a multiscale understanding of pathogenic variants and leveraging existing data to increase the power of existing and future studies through prioritization. We explore edge prediction on heterogeneous networks (graphs with multiple node and edge types) for accomplishing both tasks. First, we constructed a network with 18 node types (genes, diseases, tissues, pathophysiologies, and 14 MSigDB [molecular signatures database] collections) and 19 edge types from high-throughput, publicly available resources. From this network, composed of 40,343 nodes and 1,608,168 edges, we extracted features that describe the topology between specific genes and diseases. Next, we trained a model from GWAS associations and predicted the probability of association between each protein-coding gene and each of 29 well-studied complex diseases. The model, which achieved 132-fold enrichment in precision at 10% recall, outperformed any individual domain, highlighting the benefit of integrative approaches. We identified pleiotropy, transcriptional signatures of perturbations, pathways, and protein interactions as influential mechanisms explaining pathogenesis. Our method successfully predicted the results (with AUROC = 0.79) from a withheld multiple sclerosis (MS) GWAS despite starting with only 13 previously associated genes. Finally, we combined our network predictions with statistical evidence of association to propose four novel MS genes, three of which (JAK2, REL, RUNX3) were validated on the masked GWAS. Furthermore, our predictions provide biological support highlighting REL as the causal gene within its gene-rich locus. Users can browse all predictions online.
Heterogeneous network edge prediction effectively prioritized genetic associations and provides a powerful new approach for data integration across multiple domains.
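
To make the idea of "features that describe the topology between specific genes and diseases" concrete, the following minimal sketch counts paths of a given type between one gene and one disease on a toy heterogeneous network. It is an illustration only: the node names, the edge types, and the choice of raw path counts as the feature are all assumptions made here, and the abstract does not state which topological features the authors actually used.

```python
from collections import defaultdict

# Toy heterogeneous network: edges carry a type, and node kinds are implied
# by their names. Every node and edge type here is invented for illustration.
EDGES = [
    ("GENE_A", "interacts", "GENE_B"),
    ("GENE_A", "interacts", "GENE_C"),
    ("GENE_B", "associates", "disease_X"),
    ("GENE_C", "associates", "disease_X"),
    ("GENE_A", "expressed_in", "tissue_1"),
    ("tissue_1", "affected_by", "disease_X"),
]


def build_adjacency(edges):
    """Index undirected typed edges as adjacency[node][edge_type] -> set of neighbours."""
    adjacency = defaultdict(lambda: defaultdict(set))
    for source, edge_type, target in edges:
        adjacency[source][edge_type].add(target)
        adjacency[target][edge_type].add(source)
    return adjacency


def count_paths(adjacency, start, end, edge_types):
    """Count paths from start to end that follow edge_types in order
    and never revisit a node."""
    def walk(node, depth, visited):
        if depth == len(edge_types):
            return 1 if node == end else 0
        total = 0
        for neighbour in adjacency[node][edge_types[depth]]:
            if neighbour not in visited:
                total += walk(neighbour, depth + 1, visited | {neighbour})
        return total

    return walk(start, 0, {start})


adjacency = build_adjacency(EDGES)

# One feature per path type (an ordered sequence of edge types).
PATH_TYPES = {
    "gene-interacts-gene-associates-disease": ["interacts", "associates"],
    "gene-expressed_in-tissue-affected_by-disease": ["expressed_in", "affected_by"],
}

features = {
    name: count_paths(adjacency, "GENE_A", "disease_X", edge_types)
    for name, edge_types in PATH_TYPES.items()
}
print(features)
```

In a setting like the one the abstract describes, a feature vector of this kind would be computed for every gene-disease pair and passed to a classifier trained on known GWAS associations. The classifier itself is left out here because the abstract does not specify the model.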
