BMC Proceedings

Comparing nominal and real quality scores on next-generation sequencing genotype calls

Abstract

I seek to comprehensively evaluate the quality of the Genetic Analysis Workshop 17 (GAW17) data set by examining the accuracy of its genotype calls, which were based on the pilot3 data of the 1000 Genomes Project. Taking advantage of the overlap between the 1000 Genomes Project and HapMap samples, I compared GAW17 genotype calls with HapMap III, release 2, genotype calls for an individual. These genotype calls should be concordant almost everywhere. Instead, I found an astonishingly low concordance of 65.4%. Regarding HapMap as the gold standard, I assume that this is a GAW17 data problem and seek to explain the discordance accordingly. I found that a large proportion of the discordance occurred outside targeted regions and that concordance could be improved to at least 94.6% simply by staying within targeted regions, which were sequenced across more samples. Furthermore, I found that in certain individuals, high sample counts did little to improve concordance, and I concluded that the quality scores for a certain sample's sequence reads were simply incorrect.
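The concordance comparison described in the abstract can be illustrated with a minimal Python sketch. This is not the author's pipeline: the function names (`normalize`, `in_regions`, `concordance`), the dictionary layout keyed by (chromosome, position), and the toy genotypes and target interval are all assumptions made for illustration. Concordance is taken here to be the fraction of sites called in both sets whose unphased genotypes agree, optionally restricted to targeted regions.

```python
from typing import Dict, List, Optional, Tuple

Site = Tuple[str, int]            # (chromosome, 1-based position)
Region = Tuple[str, int, int]     # (chromosome, start, end), inclusive


def normalize(genotype: str) -> Tuple[str, ...]:
    """Treat 'AG' and 'GA' as the same unphased genotype."""
    return tuple(sorted(genotype.upper()))


def in_regions(site: Site, regions: List[Region]) -> bool:
    """Return True if the site falls inside any of the given regions."""
    chrom, pos = site
    return any(c == chrom and start <= pos <= end for c, start, end in regions)


def concordance(
    calls_a: Dict[Site, str],
    calls_b: Dict[Site, str],
    regions: Optional[List[Region]] = None,
) -> Optional[float]:
    """Fraction of shared sites with matching genotypes.

    Only sites present in both call sets are compared. If `regions` is
    given, the comparison is limited to sites inside those regions.
    Returns None when there are no sites to compare.
    """
    shared = [
        site for site in calls_a
        if site in calls_b and (regions is None or in_regions(site, regions))
    ]
    if not shared:
        return None
    matches = sum(
        normalize(calls_a[site]) == normalize(calls_b[site]) for site in shared
    )
    return matches / len(shared)


# Hypothetical toy data, not taken from GAW17 or HapMap.
sequencing_calls = {("1", 100): "AG", ("1", 200): "CC", ("1", 5000): "TT", ("2", 50): "GA"}
array_calls = {("1", 100): "GA", ("1", 200): "CC", ("1", 5000): "CT", ("2", 50): "AG", ("3", 1): "AA"}
targeted = [("1", 1, 1000)]       # hypothetical targeted interval

overall = concordance(sequencing_calls, array_calls)
within_targets = concordance(sequencing_calls, array_calls, targeted)
print(overall, within_targets)
```

In this toy example the single mismatch lies outside the hypothetical targeted interval, so restricting the comparison to targeted sites raises the concordance. The numbers are made up and say nothing about the actual GAW17 result.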
