Asia-Pacific Bioinformatics Conference

Towards pan-genome read alignment to improve variation calling


Abstract

Background: A typical human genome differs from the reference genome at 4-5 million sites. This diversity is increasingly catalogued in repositories such as ExAC/gnomAD, which comprise more than 15,000 whole genomes and more than 126,000 exome sequences from different individuals. Despite this enormous diversity, resequencing data workflows are still based on a single human reference genome. Identification and genotyping of genetic variants are typically carried out on short-read data aligned to a single reference, disregarding the underlying variation.

Results: We propose a new unified framework for variant calling with short-read data that utilizes a representation of human genetic variation: a pan-genomic reference. We provide a modular pipeline that can be seamlessly incorporated into existing sequencing data analysis workflows. Our tool is open source and available online: https://gitlab.com/dvalenzu/PanVC

Conclusions: Our experiments show that by replacing a standard human reference with a pan-genomic one, we achieve an improvement in single-nucleotide variant calling accuracy and in short indel calling accuracy over the widely adopted Genome Analysis Toolkit (GATK) in difficult genomic regions.
