Nucleic Acids Research

Mastering seeds for genomic size nucleotide BLAST searches


Abstract

One of the most common activities in bioinformatics is the search for similar sequences. These searches are usually carried out with the help of programs from the NCBI BLAST family. As the majority of searches are routinely performed with default parameters, a question that should be addressed is how reliable the results obtained using the default parameter values are, i.e. what fraction of potential matches have been retrieved by these searches. Our primary focus is on the initial hit parameter, also known as the seed or word, used by the NCBI BLASTn, MegaBLAST and other similar programs in searches for similar nucleotide sequences. We show that the use of default values for the initial hit parameter can have a big negative impact on the proportion of potentially similar sequences that are retrieved. We also show how the hit probability of different seeds varies with the minimum length and similarity of sequences desired to be retrieved and describe methods that help in determining appropriate seeds. The experimental results described in this paper illustrate situations in which these methods are most applicable and also show the relationship between the various BLAST parameters.
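The dependence of hit probability on seed length, alignment length and similarity can be illustrated with a small calculation. The sketch below is not the method of the paper. It assumes a simplified model in which a gapless alignment of a given length has each position matching independently with a fixed probability (the identity), and a contiguous seed "hits" if the alignment contains at least that many consecutive matches. The function name `seed_hit_probability` and its parameters are hypothetical and chosen only for illustration.

```python
def seed_hit_probability(word_size: int, length: int, identity: float) -> float:
    """Probability that `length` independent positions, each matching with
    probability `identity`, contain a run of >= `word_size` consecutive matches.

    Assumption (not from the paper): an i.i.d. match model and a contiguous seed.
    """
    if word_size <= 0:
        raise ValueError("word_size must be positive")
    if not 0.0 <= identity <= 1.0:
        raise ValueError("identity must be in [0, 1]")
    if length < word_size:
        return 0.0

    # state[k]: probability that no hit has occurred yet and the current
    # trailing run of matches has length k (0 <= k < word_size).
    state = [0.0] * word_size
    state[0] = 1.0
    hit = 0.0  # probability mass already absorbed into the "seed found" state

    for _ in range(length):
        new_state = [0.0] * word_size
        for run, prob in enumerate(state):
            if prob == 0.0:
                continue
            # A mismatch resets the trailing run to zero.
            new_state[0] += prob * (1.0 - identity)
            if run + 1 == word_size:
                # A match completes a run of word_size: the seed hits.
                hit += prob * identity
            else:
                new_state[run + 1] += prob * identity
        state = new_state
    return hit


if __name__ == "__main__":
    # Illustrative word sizes and similarity levels only; not results from the paper.
    for word_size in (11, 16, 28):
        for identity in (0.80, 0.90, 0.95):
            p = seed_hit_probability(word_size, length=100, identity=identity)
            print(f"W={word_size:2d} identity={identity:.2f} length=100 -> {p:.4f}")
```

Under this model, a longer seed can only lower the hit probability for a fixed length and identity, which is why a large default word size may miss shorter or more divergent matches. Real alignments contain gaps and non-uniform mismatch patterns, so these values are rough guides rather than exact sensitivities.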
