Conference: Euromicro International Conference on Parallel, Distributed and Network-Based Processing

ParallNormal: an efficient variant calling pipeline for unmatched sequencing data

Abstract

Next-generation sequencing is moving closer to clinical application in oncology, because it allows the identification of tumor-specific mutations acquired during cancer development, progression and resistance to therapy. As sequencing technology evolves, novel computational approaches are needed to rapidly process sequencing data into a list of clinically relevant genomic variants. Since sequencing data from both tumors and their matched normal samples are not always available (unmatched data), there is a need for a computational pipeline that performs variant calling on unmatched data. Although many accurate and precise variant calling algorithms exist, an efficient approach is still lacking. Here, we propose a parallel pipeline (ParallNormal) designed to efficiently identify genomic variants from whole-exome sequencing data in the absence of matched normal samples. ParallNormal integrates well-known algorithms such as BWA and GATK, a novel tool for duplicate removal (DuplicateRemove), and the FreeBayes variant calling algorithm. A re-engineered implementation of FreeBayes, optimized for execution on modern multi-core architectures, is also proposed. ParallNormal was applied to whole-exome sequencing data of pancreatic cancer samples without considering their matched normal samples. The robustness of ParallNormal was tested against results from the same dataset analyzed with matched normal samples, considering genes involved in pancreatic carcinogenesis. Our pipeline was able to confirm most of the variants identified using matched normal data.
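The abstract does not describe how the re-engineered FreeBayes distributes work across cores. As a purely illustrative sketch, and not the authors' method, one common way to parallelize tumor-only calling is to split the reference into fixed-size regions and run one FreeBayes process per region. The helper names (`split_regions`, `freebayes_command`), the chunk size and the file names below are hypothetical; only the `-f` (reference FASTA) and `-r` (region, 0-based, end excluded) options are taken from the FreeBayes command line.

```python
from typing import Dict, List, Tuple

# (contig, start, end) in 0-based, end-exclusive coordinates,
# which is the convention FreeBayes uses for its -r/--region option.
Region = Tuple[str, int, int]


def split_regions(contig_lengths: Dict[str, int], chunk_size: int) -> List[Region]:
    """Cut each contig into consecutive chunks of at most chunk_size bases."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    regions: List[Region] = []
    for contig, length in contig_lengths.items():
        for start in range(0, length, chunk_size):
            regions.append((contig, start, min(start + chunk_size, length)))
    return regions


def freebayes_command(reference: str, bam: str, region: Region) -> List[str]:
    """Build (but do not run) a FreeBayes call restricted to one region."""
    contig, start, end = region
    return ["freebayes", "-f", reference, "-r", f"{contig}:{start}-{end}", bam]


if __name__ == "__main__":
    # Hypothetical contig lengths and file names, for illustration only.
    for region in split_regions({"chr1": 250, "chr2": 100}, chunk_size=100):
        print(" ".join(freebayes_command("ref.fa", "tumor.bam", region)))
```

Each printed command could then be handed to a process pool (for example `concurrent.futures.ProcessPoolExecutor` with `subprocess.run`), and the per-region VCF outputs merged afterwards. Region splitting is a scatter-gather approach applied outside the caller, whereas the paper proposes changes inside FreeBayes itself.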