PLoS Genetics

Improving polygenic prediction from summary data by learning patterns of effect sharing across multiple phenotypes

Abstract

Polygenic prediction of complex trait phenotypes has become important in human genetics, especially in the context of precision medicine. The recently introduced mr.mash is a flexible and computationally efficient method that models multiple phenotypes jointly and leverages sharing of effects across them to improve prediction accuracy. However, a drawback of mr.mash is that it requires individual-level data, which are often not publicly available. In this work, we introduce mr.mash-rss, an extension of the mr.mash model that requires only summary statistics from genome-wide association studies (GWAS) and linkage disequilibrium (LD) estimates from a reference panel. By using summary data, we achieve the twin goals of increasing the applicability of the mr.mash model to data sets that are not publicly available and making it scalable to biobank-size data. Through simulations, we show that mr.mash-rss is competitive with, and often outperforms, current state-of-the-art methods for single- and multi-phenotype polygenic prediction in a variety of scenarios that differ in the pattern of effect sharing across phenotypes, the number of phenotypes, the number of causal variants, and the genomic heritability. We also present a real data analysis of 16 blood cell phenotypes in the UK Biobank, showing that mr.mash-rss achieves higher prediction accuracy than competing methods for the majority of traits, especially when the data set has a smaller sample size.
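
As a rough illustration of why GWAS summary statistics plus LD estimates can stand in for individual-level data, the sketch below uses a toy simulation. It is not the mr.mash-rss algorithm: a plain multivariate ridge regression is swapped in as a stand-in model, and all names, sizes, and the penalty value are assumptions made for the example. With standardized genotypes, the marginal effect estimates and the LD matrix are enough to rebuild the cross-products the regression needs, so the summary-data fit matches the individual-level fit. Here the LD matrix is computed in-sample; with LD from a separate reference panel, as the abstract describes, the match would only be approximate.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p, r = 500, 20, 3  # individuals, variants, phenotypes (toy sizes, assumed)

# Toy individual-level data: genotypes X (n x p) and phenotypes Y (n x r).
X = rng.binomial(2, 0.3, size=(n, p)).astype(float)
B_true = np.zeros((p, r))
B_true[:4] = rng.normal(size=(4, r))  # four causal variants with effects on all phenotypes
Y = X @ B_true + rng.normal(size=(n, r))

# Standardize genotype columns (mean 0, variance 1) and center phenotypes.
X = (X - X.mean(axis=0)) / X.std(axis=0)
Y = Y - Y.mean(axis=0)

# Summary data: marginal (one variant at a time) effect estimates and LD.
# Because each standardized column satisfies x_j'x_j = n, the marginal
# least-squares estimate for variant j is x_j'Y / n.
B_hat = X.T @ Y / n  # p x r matrix of marginal effects
R = X.T @ X / n      # p x p in-sample LD (correlation) matrix

lam = 10.0  # ridge penalty, arbitrary value for illustration

# Stand-in model fitted from individual-level data.
B_ind = np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ Y)

# Same model fitted from summary data only: X'X = n * R and X'Y = n * B_hat.
B_sum = np.linalg.solve(n * R + lam * np.eye(p), n * B_hat)

print(np.allclose(B_ind, B_sum))
```

The design point is that the fit depends on the data only through X'X and X'Y, which is what makes a summary-data version of a regression-based predictor possible in principle.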