Source: Genetics (U.S. National Institutes of Health literature collection)

The Use of Family Relationships and Linkage Disequilibrium to Impute Phase and Missing Genotypes in Up to Whole-Genome Sequence Density Genotypic Data


Abstract

A novel method, called linkage disequilibrium multilocus iterative peeling (LDMIP), for the imputation of phase and missing genotypes is developed. LDMIP performs an iterative peeling step for every locus, which accounts for the family data, and uses a forward–backward algorithm to accumulate information across loci. Marker similarity between haplotype pairs is used to impute possible missing genotypes and phases, which relies on the linkage disequilibrium between closely linked markers. After this imputation step, the combined iterative peeling/forward–backward algorithm is applied again, until convergence. The calculations per iteration scale linearly with number of markers and number of individuals in the pedigree, which makes LDMIP well suited to large numbers of markers and/or large numbers of individuals. Per iteration calculations scale quadratically with the number of alleles, which implies biallelic markers are preferred. In a situation with up to 15% randomly missing genotypes, the error rate of the imputed genotypes was <1% and ∼99% of the missing genotypes were imputed. In another example, LDMIP was used to impute whole-genome sequence data consisting of 17,321 SNPs on a chromosome. Imputation of the sequence was based on the information of 20 (re)sequenced founder individuals and genotyping their descendants for a panel of 3000 SNPs. The error rate of the imputed SNP genotypes was 10%. However, if the parents of these 20 founders are also sequenced, >99% of missing genotypes are imputed correctly.
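The forward–backward step that accumulates information across loci can be illustrated with a minimal sketch. This is not the authors' LDMIP implementation. It assumes a simplified two-state model in which, at each locus, an individual inherits either the first (state 0) or second (state 1) haplotype of one parent, with a recombination fraction between adjacent loci. The per-locus likelihoods (`emissions`) stand in for the output of the peeling step and are taken as given. The function name, its parameters, and the example numbers are all hypothetical.

```python
def forward_backward(emissions, recomb):
    """Posterior inheritance-state probabilities per locus (illustrative only).

    emissions: list of (e0, e1), the assumed likelihood of the data at each
               locus given state 0 or state 1; at least one must be > 0.
    recomb:    list of recombination fractions between adjacent loci
               (length = number of loci - 1).
    """
    n = len(emissions)

    # Forward pass: information from loci to the left, normalised per locus.
    fwd = []
    for i, (e0, e1) in enumerate(emissions):
        if i == 0:
            p0, p1 = 0.5, 0.5  # both parental haplotypes equally likely a priori
        else:
            r = recomb[i - 1]
            f0, f1 = fwd[i - 1]
            p0 = f0 * (1 - r) + f1 * r
            p1 = f0 * r + f1 * (1 - r)
        a0, a1 = p0 * e0, p1 * e1
        s = a0 + a1
        fwd.append((a0 / s, a1 / s))

    # Backward pass: information from loci to the right.
    bwd = [(1.0, 1.0)] * n
    for i in range(n - 2, -1, -1):
        r = recomb[i]
        e0, e1 = emissions[i + 1]
        n0, n1 = bwd[i + 1]
        b0 = (1 - r) * e0 * n0 + r * e1 * n1
        b1 = r * e0 * n0 + (1 - r) * e1 * n1
        s = b0 + b1
        bwd[i] = (b0 / s, b1 / s)

    # Combine both directions into a normalised posterior at every locus.
    post = []
    for (f0, f1), (b0, b1) in zip(fwd, bwd):
        s = f0 * b0 + f1 * b1
        post.append((f0 * b0 / s, f1 * b1 / s))
    return post


# The middle locus is uninformative on its own (equal likelihoods),
# but its informative neighbours pull its posterior toward state 0.
posterior = forward_backward(
    emissions=[(0.9, 0.1), (0.5, 0.5), (0.9, 0.1)],
    recomb=[0.01, 0.01],
)
```

Each pass visits every locus once, so the work in this sketch grows linearly with the number of markers, in line with the scaling described in the abstract. A locus with a missing genotype would enter as an uninformative likelihood, as the middle locus does here.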
