
elPrep: High-Performance Preparation of Sequence Alignment/Map Files for Variant Calling


Abstract

elPrep is a high-performance tool for preparing sequence alignment/map files for variant calling in sequencing pipelines. It can be used as a replacement for SAMtools and Picard for preparation steps such as filtering, sorting, marking duplicates, and reordering contigs, while producing identical results. What sets elPrep apart is its software architecture, which executes a preparation pipeline in a single pass through the data, no matter how many preparation steps the pipeline contains. elPrep is designed as a multithreaded application that runs entirely in memory, avoids repeated file I/O, and merges the computation of several preparation steps to significantly reduce execution time. For example, for a preparation pipeline of five steps on a whole-exome BAM file (NA12878), we reduce the execution time from about 1 hour 40 minutes, when using a combination of SAMtools and Picard, to about 15 minutes when using elPrep, while utilising the same server resources, here 48 threads and 23 GB of RAM. For the same pipeline on whole-genome data (NA12878), elPrep reduces the runtime from 24 hours to less than 5 hours. As a typical clinical study may contain sequencing data for hundreds of patients, elPrep can remove hundreds of hours of computing time, and thus substantially reduce analysis time and cost.
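
The single-pass idea described in the abstract can be illustrated with a small sketch. This is not elPrep's implementation: the record layout (a plain dictionary), the two steps `drop_unmapped` and `tag_read_group`, and the helper names `compose` and `single_pass` are all hypothetical, chosen only to show how several per-record preparation steps might be fused so that the data is traversed once rather than once per step.

```python
from typing import Callable, Iterable, Iterator, Optional

# Hypothetical stand-in for an alignment record; real SAM/BAM records carry far more fields.
Record = dict
# A step returns a (possibly modified) record, or None to drop it.
Step = Callable[[Record], Optional[Record]]


def drop_unmapped(rec: Record) -> Optional[Record]:
    """Illustrative filter: discard records flagged as unmapped."""
    return None if rec["unmapped"] else rec


def tag_read_group(rec: Record) -> Optional[Record]:
    """Illustrative transform: attach an assumed read-group label."""
    return {**rec, "rg": "sample1"}


def compose(steps: Iterable[Step]) -> Step:
    """Fuse several per-record steps into one function."""
    steps = list(steps)

    def fused(rec: Record) -> Optional[Record]:
        for step in steps:
            rec = step(rec)
            if rec is None:  # a filter dropped the record; skip the remaining steps
                return None
        return rec

    return fused


def single_pass(records: Iterable[Record], steps: Iterable[Step]) -> Iterator[Record]:
    """Apply all steps while iterating over the input exactly once."""
    fused = compose(steps)
    for rec in records:
        out = fused(rec)
        if out is not None:
            yield out


records = [
    {"name": "r1", "unmapped": False},
    {"name": "r2", "unmapped": True},
    {"name": "r3", "unmapped": False},
]
result = list(single_pass(records, [drop_unmapped, tag_read_group]))
```

Here both steps run inside one loop over `records`, whereas chaining separate tools would read and write the whole file once per step. Steps that need a global view of the data, such as sorting or duplicate marking, do not fit this purely per-record model and are left out of the sketch.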
