Journal: Genetic Epidemiology

Estimation of DNA contamination and its sources in genotyped samples

Abstract

Array genotyping is a cost-effective and widely used tool that enables assessment of up to millions of genetic markers in hundreds of thousands of individuals. Genotyping array data are typically highly accurate but sensitive to mixing of DNA samples from multiple individuals before or during genotyping. Contaminated samples can lead to genotyping errors and consequently cause false positive signals or reduce the power of association analyses. Here, we propose a new method to identify contaminated samples and the sources of contamination within a genotyping batch. Through analysis of array intensity and genotype data from intentionally mixed samples and from 22,366 samples of the Michigan Genomics Initiative, an ongoing biobank-based study, we show that our method can reliably estimate contamination. We also show that identifying sources of contamination can implicate problematic sample processing steps and guide process improvements. Compared to existing methods, our approach estimates the proportion of contaminating DNA more accurately, eliminates the need for external databases of allele frequencies, and provides contamination estimates that are more robust to the ancestral origin of the contaminating sample.
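The abstract does not spell out the estimator, so the following is only an illustrative sketch of the general idea of estimating a contamination proportion from array intensities, not the authors' method. It assumes a toy model in which, at a marker where the sample is homozygous AA, the expected B-allele frequency (BAF) is `alpha * p`, and at a BB marker the expected `1 - BAF` is `alpha * (1 - p)`, where `alpha` is the contaminating DNA fraction and `p` is the B-allele frequency computed within the batch. The names `estimate_contamination` and `simulate_sample` are hypothetical, and the simulation uses true genotypes rather than genotypes called from the intensities.

```python
import random


def estimate_contamination(baf, genotypes, batch_freq):
    """Least-squares estimate of the contamination fraction alpha.

    Toy model (an assumption, not the published estimator):
      AA marker (genotype 0): E[BAF]     = alpha * p
      BB marker (genotype 2): E[1 - BAF] = alpha * (1 - p)
    Heterozygous markers are skipped. The fit is a regression through
    the origin, alpha = sum(x * y) / sum(x * x).
    """
    num = 0.0
    den = 0.0
    for b, g, p in zip(baf, genotypes, batch_freq):
        if g == 0:
            x, y = p, b
        elif g == 2:
            x, y = 1.0 - p, 1.0 - b
        else:
            continue
        num += x * y
        den += x * x
    if den == 0.0:
        raise ValueError("no informative homozygous markers")
    # Clamp to the valid range for a mixing proportion.
    return min(max(num / den, 0.0), 1.0)


def simulate_sample(alpha, batch_freq, rng, noise_sd=0.01):
    """Simulate BAF values for one sample mixed with a contaminant.

    Both the sample and the contaminant draw genotypes from the batch
    allele frequencies; Gaussian noise stands in for intensity noise.
    """
    baf, genotypes = [], []
    for p in batch_freq:
        own = (rng.random() < p) + (rng.random() < p)      # 0, 1, or 2 B alleles
        other = (rng.random() < p) + (rng.random() < p)    # contaminant genotype
        value = (1 - alpha) * own / 2 + alpha * other / 2
        baf.append(value + rng.gauss(0.0, noise_sd))
        genotypes.append(own)
    return baf, genotypes


if __name__ == "__main__":
    rng = random.Random(0)
    freqs = [rng.uniform(0.05, 0.95) for _ in range(5000)]
    baf, geno = simulate_sample(0.05, freqs, rng)
    print(round(estimate_contamination(baf, geno, freqs), 3))
```

Taking `p` from the batch itself, rather than from a reference panel, mirrors the abstract's point that no external allele frequency database is needed; everything else in the sketch is a simplification.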
