Journal: Standards in Genomic Sciences

BLAST-QC: automated analysis of BLAST results


Abstract

The Basic Local Alignment Search Tool (BLAST) from NCBI is the preferred utility for sequence alignment and identification in bioinformatics and genomics research. Among researchers using NCBI’s BLAST software, it is well known that analyzing the results of a large BLAST search can be tedious and time-consuming. Furthermore, given the recent discussions over the effects of parameters such as ‘-max_target_seqs’ on the BLAST heuristic search process, the use of these search options is questionable. This leaves a stand-alone parser as one of the few options for condensing these large datasets, and because few parsers are available for download online, researchers are left to write a specialized piece of software whenever they need to analyze BLAST results. The need for a streamlined, fast script that solves these issues and can be easily integrated into a variety of bioinformatics and genomics workflows was the initial motivation for developing this software. In this study, we demonstrate the effectiveness of BLAST-QC for analysis of BLAST results and its advantages over the other available options. Using genetic sequence data from our bioinformatic workflows, we establish BLAST-QC’s superior runtime when compared to existing parsers developed with the commonly used BioPerl and BioPython modules, as well as to C and Java implementations of the BLAST-QC program. We discuss the ‘-max_target_seqs’ parameter, its usage and the controversy surrounding it, and offer a solution by demonstrating that our software can provide the functionality this parameter was assumed to produce, as well as a variety of other parsing options. Executions of the script on example datasets are given, demonstrating the implemented functionality and providing test cases for the program. BLAST-QC is designed to be integrated into existing software, and we establish its effectiveness as a module of workflows or other processes.
BLAST-QC provides the community with a simple, lightweight and portable Python script that allows for easy quality control of BLAST results while avoiding the drawbacks of other options. These drawbacks include the uncertain results of applying the -max_target_seqs parameter and the cumbersome dependencies of alternatives such as BioPerl and Java, which add complexity and run time when processing large sets of sequences. BLAST-QC is ideal for use in the high-throughput workflows and pipelines common in bioinformatic and genomic research, and the script has been designed for portability and easy integration into whatever type of process the user may be running.
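
The abstract describes recovering the "best N hits per query" behavior that -max_target_seqs was assumed to provide by filtering results after the search rather than during it. The following is a minimal sketch of that idea only; it is not BLAST-QC's code or its command-line interface. It assumes BLAST tabular output with the default 12 columns (-outfmt 6), and the function name `top_hits`, its parameters, and the sample rows are hypothetical and made up for illustration.

```python
import csv
import io
from collections import defaultdict

# Default column order of BLAST tabular output (-outfmt 6).
COLUMNS = ["qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
           "qstart", "qend", "sstart", "send", "evalue", "bitscore"]


def top_hits(handle, n=1, max_evalue=None):
    """Return {query_id: [hit, ...]} with the n best hits per query.

    Hypothetical helper: hits are ranked by bit score (highest first),
    with e-value (lowest first) as a tie-breaker.
    """
    hits = defaultdict(list)
    for row in csv.reader(handle, delimiter="\t"):
        # Skip blank lines and the comment lines written by -outfmt 7.
        if not row or row[0].startswith("#"):
            continue
        hit = dict(zip(COLUMNS, row))
        hit["evalue"] = float(hit["evalue"])
        hit["bitscore"] = float(hit["bitscore"])
        # Optional quality threshold applied before ranking.
        if max_evalue is not None and hit["evalue"] > max_evalue:
            continue
        hits[hit["qseqid"]].append(hit)
    return {
        query: sorted(found, key=lambda h: (-h["bitscore"], h["evalue"]))[:n]
        for query, found in hits.items()
    }


# Made-up sample rows, not real BLAST output.
SAMPLE = (
    "q1\tsA\t98.0\t100\t2\t0\t1\t100\t1\t100\t1e-50\t180\n"
    "q1\tsB\t99.0\t100\t1\t0\t1\t100\t1\t100\t1e-55\t190\n"
    "q1\tsC\t80.0\t50\t10\t0\t1\t50\t1\t50\t1e-3\t40\n"
    "q2\tsD\t90.0\t80\t8\t0\t1\t80\t1\t80\t1e-20\t100\n"
)

if __name__ == "__main__":
    for query, best in top_hits(io.StringIO(SAMPLE), n=1).items():
        print(query, [h["sseqid"] for h in best])
```

Because the ranking happens after the search has finished, the number of hits kept per query does not influence which alignments BLAST itself reports; the search can be run with permissive settings and narrowed afterwards.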
