Journal: Bioinformatics

Powerful fusion: PSI-BLAST and consensus sequences


Abstract

Motivation: A typical PSI-BLAST search consists of iterative scanning and alignment of a large sequence database during which a scoring profile is progressively built and refined. Such a profile can also be stored and used to search against a different database of sequences. Using it to search against a database of consensus rather than native sequences is a simple add-on that boosts performance surprisingly well. The improvement comes at a price: we hypothesized that random alignment score statistics would differ between native and consensus sequences. Thus PSI-BLAST-based profile searches against consensus sequences might incorrectly estimate statistical significance of alignment scores. In addition, iterative searches against consensus databases may fail. Here, we addressed these challenges in an attempt to harness the full power of the combination of PSI-BLAST and consensus sequences.

Results: We studied alignment score statistics for various types of consensus sequences. In general, the score distribution parameters of profile-based consensus sequence alignments differed significantly from those derived for the native sequences. PSI-BLAST partially compensated for the parameter variation. We have identified a protocol for building specialized consensus sequences that significantly improved search sensitivity and preserved score distribution parameters. As a result, PSI-BLAST profiles can be used to search specialized consensus sequences without sacrificing estimates of statistical significance. We also provided results indicating that iterative PSI-BLAST searches against consensus sequences could work very well. Overall, we showed how a very popular and effective method could be used to identify significantly more relevant similarities among protein sequences.
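Why differing score distribution parameters matter can be illustrated with the standard Karlin-Altschul relation that BLAST-family tools use to turn a raw alignment score S into an expected number of chance hits, E = K * m * n * exp(-lambda * S). The sketch below is not the authors' analysis: the function name `evalue`, the sequence lengths, and both lambda values are hypothetical numbers chosen only to show how an E-value computed with one lambda can understate the chance-hit rate when the true lambda for the searched database is lower.

```python
import math


def evalue(score, lam, k, query_len, db_len):
    """Karlin-Altschul expected number of chance alignments scoring >= score."""
    return k * query_len * db_len * math.exp(-lam * score)


# Illustrative values only (assumptions, not taken from the paper).
SCORE = 60          # raw alignment score
K = 0.041           # scale parameter, held fixed for simplicity
QUERY_LEN = 250     # effective query length
DB_LEN = 1e8        # effective database length

LAMBDA_ASSUMED = 0.267  # lambda the search tool assumes (native sequences)
LAMBDA_ACTUAL = 0.240   # hypothetical lower lambda for a consensus database

assumed = evalue(SCORE, LAMBDA_ASSUMED, K, QUERY_LEN, DB_LEN)
actual = evalue(SCORE, LAMBDA_ACTUAL, K, QUERY_LEN, DB_LEN)

print(f"E-value reported with assumed lambda: {assumed:.3g}")
print(f"E-value under the lower lambda:       {actual:.3g}")
print(f"Chance-hit rate understated by a factor of {actual / assumed:.1f}")
```

Because lambda sits in the exponent, even a small shift changes the E-value multiplicatively, which is why a consensus-building protocol that preserves the score distribution parameters lets the usual significance estimates be reused.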
