Journal: Bioinformatics

Comparative analysis of algorithms for next-generation sequencing read alignment

Abstract

Motivation: The advent of next-generation sequencing (NGS) techniques presents novel opportunities for many applications in the life sciences. The vast number of short reads produced by these techniques, however, poses significant computational challenges. The first step in many types of genomic analysis is the mapping of short reads to a reference genome, and several groups have developed dedicated algorithms and software packages to perform this function. Because the developers of these packages optimize their algorithms with respect to different considerations, the relative merits of the packages remain unclear. For scientists who generate and use NGS data in their own research projects, however, an important consideration is choosing the software that is most suitable for their application.

Results: With a view to comparing existing short read alignment software, we develop a simulation and evaluation suite, Seal, which simulates NGS runs for different configurations of various factors, including sequencing error, indels and coverage. We also develop criteria to compare the performance of software with disparate output structures (e.g. some packages return a single alignment while others return multiple possible alignments). Using these criteria, we comprehensively evaluate the performance of Bowtie, BWA, mrFAST and mrsFAST, Novoalign, SHRiMP and SOAPv2 with regard to accuracy and runtime.

Conclusion: We expect that the results presented here will be useful to investigators in choosing the alignment software that is most suitable for their specific research aims. Our results also provide insights into the factors that should be considered when using alignment results. Seal can also be used to evaluate the performance of algorithms that use deep sequencing data for various purposes (e.g. identification of genomic variants).
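The kind of simulation and scoring described under Results can be illustrated with a minimal Python sketch. It is not Seal's implementation: the function names `simulate_reads` and `accuracy`, the uniform sampling of read positions, and the substitution-only error model are all assumptions made for illustration, and indels are not modeled here.

```python
import random


def simulate_reads(reference, read_len, coverage, error_rate, seed=0):
    """Sample reads uniformly from `reference` and inject substitution errors.

    Hypothetical helper: returns a list of (true_start, read_sequence) pairs.
    The number of reads is chosen so that
    n_reads * read_len / len(reference) is approximately `coverage`.
    """
    rng = random.Random(seed)
    n_reads = int(coverage * len(reference) / read_len)
    reads = []
    for _ in range(n_reads):
        start = rng.randrange(len(reference) - read_len + 1)
        bases = list(reference[start:start + read_len])
        for i, base in enumerate(bases):
            # Assumed error model: each base is substituted independently.
            if rng.random() < error_rate:
                bases[i] = rng.choice([b for b in "ACGT" if b != base])
        reads.append((start, "".join(bases)))
    return reads


def accuracy(true_positions, reported_positions):
    """Fraction of reads whose true origin is among the reported positions.

    `reported_positions[i]` is a list of positions, so an aligner that
    returns a single hit and one that returns several candidate hits can
    be scored with the same criterion (an assumed, simplified criterion).
    """
    if not true_positions:
        return 0.0
    hits = sum(1 for t, r in zip(true_positions, reported_positions) if t in r)
    return hits / len(true_positions)
```

In use, the true start positions returned by `simulate_reads` would be compared with the positions that each aligner reports for the same reads. Treating every aligner's output as a list of candidate positions is one simple way to put single-hit and multi-hit tools on the same footing.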
