Conference: IEEE International Conference on e-Science

A portable and scalable workflow for detecting structural variants in whole-genome sequencing data

Abstract

I. Introduction

Cancer affects millions of people worldwide. With the advent of novel DNA sequencing technologies, whole-genome sequencing (WGS) is becoming an integral part of cancer diagnostics and can potentially enable treatments tailored to individual patients. Despite the advances made by large-scale cancer genomics projects (such as TCGA and ICGC's PCAWG), systematic and comprehensive analysis of massive genomic data, in particular the detection and interpretation of structural variations (SVs) in genomes, remains challenging due to computational and algorithmic limitations [1-2]. A range of methods is available to detect SVs in short-read sequencing data, each producing different results. Comprehensive SV detection therefore requires the use of multiple methods or tools (callers). In fact, most SV callers implement more than one approach, including the use of split-read information, discordantly aligned read pairs, read depth and short-read assembly, to improve sensitivity and/or specificity [3-6]. Alternatively, multiple tools, often written in different languages, can be readily combined using a workflow management system. However, a workflow developed on one computing system is not necessarily portable to or reusable on another, owing to the complexity of the software environments involved, system usage policies or the different batch schedulers used by HPC clusters (e.g. Grid Engine, Slurm or Torque).
