Conference: International Conference on Systems Biology

Comparison of multi-sample variant calling methods for whole genome sequencing



Abstract

Rapid advances in next-generation sequencing (NGS) technologies have facilitated the search for genetic susceptibility factors that influence disease risk in human genetics. In particular, whole genome sequencing (WGS) has been used to obtain the most comprehensive picture of an individual's genetic variation and to evaluate all of that variation in detail. This requires sophisticated methods that accurately call high-quality variants and genotypes simultaneously across a cohort of individuals from raw sequence data. Using chromosome 22 from 818 WGS samples from the Alzheimer's Disease Neuroimaging Initiative (ADNI), the largest WGS dataset related to a single disease, we compared two multi-sample variant calling methods for detecting single nucleotide variants (SNVs) and short insertions and deletions (indels): (1) reducing the analysis-ready reads (BAM) file to a manageable size by keeping only the information essential for variant calling ("REDUCE"), and (2) calling variants individually on each sample and then performing joint genotyping on the variant files produced for all samples in the cohort ("JOINT"). JOINT identified 515,210 SNVs and 60,042 indels, while REDUCE identified 358,303 SNVs and 52,855 indels, so JOINT detected 156,907 more SNVs and 7,187 more indels than REDUCE. The concordance rate between the two methods was 99.60% for SNVs and 99.06% for indels. For SNVs, evaluation against HumanOmni 2.5M genotyping arrays showed a concordance rate of 99.68% for JOINT and 99.50% for REDUCE. REDUCE required more computational time and memory than JOINT. Our findings indicate that multi-sample variant calling with the JOINT process is a promising strategy for variant detection, which should facilitate our understanding of the underlying pathogenesis of human diseases.
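The abstract reports concordance rates but does not spell out how they were computed. As a minimal sketch, assuming concordance means the fraction of sites called by both sources whose genotypes agree, it could be calculated as below. The function names, the site key `(chromosome, position)`, and the toy genotypes are all hypothetical and are not taken from the study.

```python
from typing import Dict, Tuple

# Hypothetical representation: a site is (chromosome, position) and the value
# is a VCF-style genotype string such as "0/1" or "1|0".
Site = Tuple[str, int]
Calls = Dict[Site, str]


def normalize(genotype: str) -> Tuple[str, ...]:
    """Treat phased and unphased genotypes alike and ignore allele order."""
    return tuple(sorted(genotype.replace("|", "/").split("/")))


def concordance_rate(calls_a: Calls, calls_b: Calls) -> float:
    """Fraction of sites present in both call sets whose genotypes agree.

    Assumption: sites called by only one source are excluded from the
    denominator. The study may have used a different definition.
    """
    shared = calls_a.keys() & calls_b.keys()
    if not shared:
        raise ValueError("no shared sites to compare")
    matches = sum(normalize(calls_a[s]) == normalize(calls_b[s]) for s in shared)
    return matches / len(shared)


# Toy data invented for illustration only (not ADNI results).
joint_calls = {("22", 100): "0/1", ("22", 200): "1/1", ("22", 300): "0/0", ("22", 400): "0/1"}
reduce_calls = {("22", 100): "1|0", ("22", 200): "1/1", ("22", 300): "0/1"}

rate = concordance_rate(joint_calls, reduce_calls)
print(f"concordance: {rate:.2%}")
```

The same function could compare either call set against array genotypes, provided the array data are keyed by the same site coordinates and encoded against the same reference alleles.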
