Journal: Genetic Epidemiology

Characterization and correction of error in genome-wide IBD estimation for samples with population structure


Abstract

The proportion of the genome that is shared identical by descent (IBD) between pairs of individuals is often estimated in studies involving genome-wide SNP data. These estimates can be used to check pedigrees, estimate heritability, and adjust association analyses. We focus on the method of moments technique as implemented in PLINK [Purcell et al., 2007] and other software that estimates the proportions of the genome at which two individuals share 0, 1, or 2 alleles IBD. This technique is based on the assumption that the study sample is drawn from a single, homogeneous, randomly mating population. This assumption is violated if pedigree founders are drawn from multiple populations or include admixed individuals. In the presence of population structure, the method of moments estimator has an inflated variance and can be biased because it relies on sample-based allele frequency estimates. In the case of the PLINK estimator, which truncates genome-wide sharing estimates at zero and one to generate biologically interpretable results, the bias is most often towards over-estimation of relatedness between ancestrally similar individuals. Using simulated pedigrees, we are able to demonstrate and quantify the behavior of the PLINK method of moments estimator under different population structure conditions. We also propose a simple method based on SNP pruning for improving genome-wide IBD estimates when the assumption of a single, homogeneous population is violated.
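To make the estimator described in the abstract concrete, the following is a minimal sketch of a method of moments IBD calculation for one pair of individuals. It is not PLINK's code: it assumes biallelic SNPs with genotypes coded as 0/1/2 allele counts, no missing data, and allele frequencies strictly between 0 and 1. It omits PLINK's finite-sample correction, and the clip-and-renormalize step is a simplified stand-in for PLINK's truncation rules. All function names (`ibs_counts`, `expected_ibs`, `mom_ibd`, `bound`, `pi_hat`) are hypothetical.

```python
from typing import Sequence, Tuple


def ibs_counts(g1: Sequence[int], g2: Sequence[int]) -> Tuple[int, int, int]:
    """Count SNPs at which the pair shares 0, 1, or 2 alleles identical by state."""
    counts = [0, 0, 0]
    for a, b in zip(g1, g2):
        counts[2 - abs(a - b)] += 1
    return counts[0], counts[1], counts[2]


def expected_ibs(freqs: Sequence[float]) -> Tuple[float, float, float, float, float]:
    """Expected IBS counts given IBD state, summed over SNPs.

    Assumes Hardy-Weinberg proportions in a single homogeneous population,
    which is exactly the assumption the abstract says can be violated.
    """
    e00 = e10 = e20 = e11 = e21 = 0.0
    for p in freqs:
        q = 1.0 - p
        e00 += 2 * p * p * q * q                  # IBS0 given IBD0
        e10 += 4 * p**3 * q + 4 * p * q**3        # IBS1 given IBD0
        e20 += p**4 + q**4 + 4 * p * p * q * q    # IBS2 given IBD0
        e11 += 2 * p * q                          # IBS1 given IBD1
        e21 += 1.0 - 2 * p * q                    # IBS2 given IBD1
    return e00, e10, e20, e11, e21


def mom_ibd(g1: Sequence[int], g2: Sequence[int],
            freqs: Sequence[float]) -> Tuple[float, float, float]:
    """Raw (untruncated) estimates of P(IBD=0), P(IBD=1), P(IBD=2)."""
    n0, n1, n2 = ibs_counts(g1, g2)
    e00, e10, e20, e11, e21 = expected_ibs(freqs)
    m = float(len(freqs))                         # IBS2 given IBD2 is certain
    z0 = n0 / e00
    z1 = (n1 - z0 * e10) / e11
    z2 = (n2 - z0 * e20 - z1 * e21) / m
    return z0, z1, z2


def bound(z: Sequence[float]) -> Tuple[float, ...]:
    """Simplified truncation: clip each estimate to [0, 1], then renormalize."""
    clipped = [min(1.0, max(0.0, v)) for v in z]
    total = sum(clipped)
    return tuple(v / total for v in clipped)


def pi_hat(z: Sequence[float]) -> float:
    """Genome-wide proportion shared IBD: P(IBD=1)/2 + P(IBD=2)."""
    return 0.5 * z[1] + z[2]
```

In this sketch `freqs` would be the sample-based allele frequencies. If they are pooled across several ancestral populations, the expected IBS0 count for two ancestrally similar individuals is too large relative to what is observed, so `z0` falls and the estimated relatedness rises, which is the direction of bias the abstract describes.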
