Journal: Genetics: A Periodical Record of Investigations Bearing on Heredity and Variation

A Maximum-Likelihood Method to Correct for Allelic Dropout in Microsatellite Data with No Replicate Genotypes

Abstract

Allelic dropout is a commonly observed source of missing data in microsatellite genotypes, in which one or both allelic copies at a locus fail to be amplified by the polymerase chain reaction. Especially for samples with poor DNA quality, this problem causes a downward bias in estimates of observed heterozygosity and an upward bias in estimates of inbreeding, owing to mistaken classifications of heterozygotes as homozygotes when one of the two copies drops out. One general approach for avoiding allelic dropout involves repeated genotyping of homozygous loci to minimize the effects of experimental error. Existing computational alternatives often require replicate genotyping as well. These approaches, however, are costly and are suitable only when enough DNA is available for repeated genotyping. In this study, we propose a maximum-likelihood approach together with an expectation-maximization algorithm to jointly estimate allelic dropout rates and allele frequencies when only one set of nonreplicated genotypes is available. Our method considers estimates of allelic dropout caused by both sample-specific factors and locus-specific factors, and it allows for deviation from Hardy-Weinberg equilibrium owing to inbreeding. Using the estimated parameters, we correct the bias in the estimation of observed heterozygosity through the use of multiple imputations of alleles in cases where dropout might have occurred. With simulated data, we show that our method can (1) effectively reproduce patterns of missing data and heterozygosity observed in real data; (2) correctly estimate model parameters, including sample-specific dropout rates, locus-specific dropout rates, and the inbreeding coefficient; and (3) successfully correct the downward bias in estimating the observed heterozygosity. We find that our method is fairly robust to violations of model assumptions caused by population structure and by genotyping errors from sources other than allelic dropout. 
Because the data sets imputed under our model can be investigated in additional subsequent analyses, our method will be useful for preparing data for applications in diverse contexts in population genetics and molecular ecology.
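
The joint estimation idea in the abstract can be illustrated with a much-simplified sketch. This is not the authors' model: it assumes a single locus, one dropout rate `d` shared by all allele copies, Hardy-Weinberg proportions (no inbreeding coefficient), and that every missing genotype is caused by both copies dropping out. It also replaces the multiple imputation step with a posterior expectation. The function names (`simulate`, `em_dropout`, `heterozygosity`) and all parameter values are hypothetical, chosen only for illustration.

```python
import random
from collections import Counter

def simulate(n, freqs, d, rng):
    """Draw n diploid genotypes under HWE; each copy drops out with prob d."""
    alleles = list(range(len(freqs)))
    data = []
    for _ in range(n):
        copies = rng.choices(alleles, weights=freqs, k=2)
        seen = [a for a in copies if rng.random() >= d]
        if not seen:
            data.append(None)                 # both copies dropped: missing
        elif len(seen) == 1:
            data.append((seen[0], seen[0]))   # one copy dropped: looks homozygous
        else:
            data.append(tuple(sorted(seen)))
    return data

def em_dropout(data, k, iters=200):
    """EM for allele frequencies p and dropout rate d (simplified model)."""
    n = len(data)
    counts = Counter(data)
    p, d = [1.0 / k] * k, 0.1                 # arbitrary starting values
    for _ in range(iters):
        allele = [0.0] * k                    # expected allele-copy counts
        drops = 0.0                           # expected number of dropped copies
        for g, c in counts.items():
            if g is None:                     # both copies dropped, alleles unknown
                drops += 2 * c
                for a in range(k):
                    allele[a] += 2 * c * p[a]
            elif g[0] != g[1]:                # observed heterozygote: no dropout
                allele[g[0]] += c
                allele[g[1]] += c
            else:                             # observed homozygote: maybe one drop
                i = g[0]
                w = 2 * d / ((1 - d) * p[i] + 2 * d)   # P(one copy dropped | data)
                drops += c * w
                allele[i] += c * (2 - w)
                for a in range(k):            # hidden copy follows p
                    allele[a] += c * w * p[a]
        p = [x / (2 * n) for x in allele]     # M-step: complete-data MLEs
        d = drops / (2 * n)
    return p, d

def heterozygosity(data, p, d):
    """Observed heterozygosity and its expected value after correcting for dropout."""
    typed = [g for g in data if g is not None]
    observed = sum(g[0] != g[1] for g in typed) / len(typed)
    hidden = sum(2 * d / ((1 - d) * p[g[0]] + 2 * d) * (1 - p[g[0]])
                 for g in typed if g[0] == g[1])
    return observed, observed + hidden / len(typed)

rng = random.Random(1)
true_p, true_d = [0.5, 0.3, 0.2], 0.2         # illustrative values only
data = simulate(20000, true_p, true_d, rng)
p_hat, d_hat = em_dropout(data, k=3)
h_obs, h_corr = heterozygosity(data, p_hat, d_hat)
print(p_hat, d_hat, h_obs, h_corr)
```

In this toy setting the raw observed heterozygosity falls below the true value of 1 - Σp² = 0.62, which is the downward bias the abstract describes, while the corrected value should land close to it. The dropout rate is identifiable here only because of the strong assumptions above; the method in the abstract instead pools information across samples and loci and allows for inbreeding.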
