Frontiers in Genetics

Controlling for population structure and genotyping platform bias in the eMERGE multi-institutional biobank linked to electronic health records

Abstract

Large-scale scientific research programs often need to combine samples across multiple cohorts to achieve the necessary power for genome-wide association studies. Controlling for genomic ancestry through principal component analysis (PCA) is a common way to address the effect of population stratification. In addition to local genomic variation, such as copy number variation and inversions, other factors directly related to combining multiple studies, such as platform and site recruitment bias, can drive the correlation patterns in PCA. In this report, we describe the combination and analysis of a multi-ethnic cohort with biobanks linked to electronic health records for large-scale genomic association discovery analyses. First, we outline the observed site and platform bias, in addition to ancestry differences. Second, we outline a general protocol for selecting variants for input into the subject variance-covariance matrix, the conventional PCA approach. Finally, we introduce an alternative approach to PCA that derives components from subject loadings calculated from a reference sample. This alternative approach, which generates principal components that control for site and platform bias in addition to ancestry differences, has the advantage of requiring fewer covariates and degrees of freedom.
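The reference-based alternative to conventional PCA can be illustrated with a minimal sketch. This is not the authors' pipeline: it assumes one common reading of the idea, in which variant loadings are estimated on a reference sample only and study subjects are then projected onto those fixed axes, so that site or platform differences among study subjects cannot shape the components themselves. The genotypes are simulated, and the function names reference_pca_loadings and project are hypothetical.

```python
import numpy as np

def reference_pca_loadings(ref_genotypes, n_components=2):
    """Estimate variant loadings from a reference panel only.

    ref_genotypes: (samples x variants) array of 0/1/2 allele counts.
    Returns the per-variant mean and standard deviation used for
    standardization, plus a (variants x n_components) loading matrix.
    """
    mean = ref_genotypes.mean(axis=0)
    std = ref_genotypes.std(axis=0)
    std[std == 0] = 1.0  # guard against monomorphic variants
    z = (ref_genotypes - mean) / std
    # Rows of vt are the principal axes in variant space.
    _, _, vt = np.linalg.svd(z, full_matrices=False)
    return mean, std, vt[:n_components].T

def project(genotypes, mean, std, loadings):
    """Project subjects onto reference-derived axes.

    Study subjects are standardized with the reference mean and
    standard deviation, not their own.
    """
    return ((genotypes - mean) / std) @ loadings

# Simulated data (an assumption for illustration): two ancestry groups
# with strongly differentiated allele frequencies at 200 variants.
rng = np.random.default_rng(0)
n_variants = 200
freq_a = rng.uniform(0.1, 0.9, size=n_variants)
freq_b = 1.0 - freq_a

# Reference panel: 50 subjects per group.
reference = np.vstack([
    rng.binomial(2, freq_a, size=(50, n_variants)),
    rng.binomial(2, freq_b, size=(50, n_variants)),
])
# Study cohort: 20 subjects per group, never used to fit the axes.
study = np.vstack([
    rng.binomial(2, freq_a, size=(20, n_variants)),
    rng.binomial(2, freq_b, size=(20, n_variants)),
])

mean, std, loadings = reference_pca_loadings(reference, n_components=2)
ref_scores = project(reference, mean, std, loadings)
study_scores = project(study, mean, std, loadings)

# Mean PC1 score per simulated ancestry group in the study cohort.
print(study_scores[:20, 0].mean(), study_scores[20:, 0].mean())
```

In an association model, the columns of study_scores would be entered as ancestry covariates. Because the axes are fixed by the reference sample, adding study subjects from another site or genotyping platform does not change the loadings.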
