Journal: PLoS One

Evaluation of serverless computing for scalable execution of a joint variant calling workflow


Abstract

Advances in whole-genome sequencing have greatly reduced the cost and time of obtaining raw genetic information, but the computational requirements of analysis remain a challenge. Serverless computing has emerged as an alternative to using dedicated compute resources, but its utility has not been widely evaluated for standardized genomic workflows. In this study, we define and execute a best-practice joint variant calling workflow using the SWEEP workflow management system. We present an analysis of performance and scalability, and discuss the utility of the serverless paradigm for executing workflows in the field of genomics research. The GATK best-practice short germline joint variant calling pipeline was implemented as a SWEEP workflow comprising 18 tasks. The workflow was executed on Illumina paired-end read samples from the European and African super populations of the 1000 Genomes project phase III. Cost and runtime increased linearly with increasing sample size, although runtime was driven primarily by a single task for larger problem sizes. Execution took a minimum of around 3 hours for 2 samples, up to nearly 13 hours for 62 samples, with costs ranging from $2 to $70.
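The linear scaling reported above can be illustrated with a rough back-of-the-envelope sketch. It assumes that the quoted extremes pair up (about 3 hours and $2 for 2 samples, about 13 hours and $70 for 62 samples) and that a straight line between those two points is a fair approximation; the abstract gives only these approximate endpoints, so the intermediate values are illustrative, not results from the study. The names `SMALL`, `LARGE`, and `linear_estimate` are hypothetical.

```python
# Approximate endpoints quoted in the abstract. Pairing the smallest run with
# the lowest cost and the largest run with the highest cost is an assumption.
SMALL = {"samples": 2, "hours": 3.0, "usd": 2.0}
LARGE = {"samples": 62, "hours": 13.0, "usd": 70.0}


def linear_estimate(n_samples: int, key: str) -> float:
    """Linearly interpolate `key` ("hours" or "usd") between the two endpoints."""
    slope = (LARGE[key] - SMALL[key]) / (LARGE["samples"] - SMALL["samples"])
    return SMALL[key] + slope * (n_samples - SMALL["samples"])


if __name__ == "__main__":
    # Print rough runtime and cost estimates for a few sample counts.
    for n in (2, 32, 62):
        hours = linear_estimate(n, "hours")
        usd = linear_estimate(n, "usd")
        print(f"{n} samples: ~{hours:.1f} h, ~${usd:.0f}")
```

Under these assumptions the marginal cost works out to roughly $1.13 and 10 minutes of runtime per additional sample. The abstract also notes that runtime for larger problem sizes was driven primarily by a single task, which a simple two-point line does not capture.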
