Journal: BMC Bioinformatics

High quality protein sequence alignment by combining structural profile prediction and profile alignment using SABERTOOTH

Abstract

Background: Protein alignments are an essential tool for many bioinformatics analyses. Sequence alignments are accurate for proteins of high sequence similarity, but they become unreliable as they approach the so-called 'twilight zone', where sequence similarity becomes indistinguishable from random. For such distant pairs, structure alignment is of much better quality. Nevertheless, sequence alignment is the only choice in the majority of cases, where structural data is not available. This situation calls for methods that extend the applicability of accurate sequence alignment to distantly related proteins.

Results: We develop a sequence alignment method that combines the prediction of a structural profile from the protein's sequence with the alignment of that profile using our recently published alignment tool SABERTOOTH. In particular, we predict the contact vector of protein structures using an artificial neural network based on position-specific scoring matrices generated by PSI-BLAST, and we align these predicted contact vectors. The resulting sequence alignments are assessed using two different tests. First, we assess the alignment quality by measuring the derived structural similarity for cases in which structures are available. Second, we quantify the ability of the significance score of the alignments to recognize structural and evolutionary relationships. As a benchmark we use a representative set of the SCOP (Structural Classification of Proteins) database, with similarities ranging from closely related proteins at SCOP family level to very distantly related proteins at SCOP fold level. Comparing these results with some prominent sequence alignment tools, we find that SABERTOOTH produces sequence alignments of better quality than those of Clustal W, T-Coffee, MUSCLE, and PSI-BLAST. HHpred, one of the most sophisticated and computationally expensive tools available, outperforms our alignment algorithm at family and superfamily levels, while the use of SABERTOOTH is advantageous for alignments at fold level. Our alignment scheme will profit from future improvements in structural profile prediction.

Conclusions: We present the automatic sequence alignment tool SABERTOOTH, which computes pairwise sequence alignments of very high quality. SABERTOOTH is especially advantageous when applied to alignments of remotely related proteins. The source code is available at http://www.fkp.tu-darmstadt.de/sabertooth_project/ , free for academic users upon request.
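
The idea of aligning two per-residue structural profiles, such as predicted contact vectors, can be illustrated with a generic global alignment by dynamic programming. The sketch below is not SABERTOOTH's actual algorithm or scoring scheme, which the abstract does not specify. The function name `align_profiles`, the similarity score `max_sim - |a_i - b_j|`, and the linear gap penalty are all assumptions chosen for illustration.

```python
from typing import List, Optional, Tuple

Pair = Tuple[Optional[int], Optional[int]]


def align_profiles(
    a: List[float],
    b: List[float],
    gap: float = 1.0,      # assumed linear gap penalty
    max_sim: float = 1.0,  # assumed score for two identical profile values
) -> Tuple[float, List[Pair]]:
    """Globally align two 1-D numeric profiles (Needleman-Wunsch style).

    Returns the alignment score and a list of index pairs (i, j);
    None marks a gap in the corresponding profile.
    """
    n, m = len(a), len(b)
    # score[i][j]: best score for aligning a[:i] with b[:j]
    score = [[0.0] * (m + 1) for _ in range(n + 1)]
    # move[i][j]: which step produced score[i][j], used for traceback
    move = [[""] * (m + 1) for _ in range(n + 1)]

    for i in range(1, n + 1):
        score[i][0] = score[i - 1][0] - gap
        move[i][0] = "up"
    for j in range(1, m + 1):
        score[0][j] = score[0][j - 1] - gap
        move[0][j] = "left"

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            # Positions with similar profile values score higher.
            sim = max_sim - abs(a[i - 1] - b[j - 1])
            candidates = [
                (score[i - 1][j - 1] + sim, "diag"),  # align a[i-1] with b[j-1]
                (score[i - 1][j] - gap, "up"),        # a[i-1] against a gap
                (score[i][j - 1] - gap, "left"),      # b[j-1] against a gap
            ]
            score[i][j], move[i][j] = max(candidates, key=lambda c: c[0])

    # Follow the stored moves back from the bottom-right corner.
    pairs: List[Pair] = []
    i, j = n, m
    while i > 0 or j > 0:
        step = move[i][j]
        if step == "diag":
            pairs.append((i - 1, j - 1))
            i, j = i - 1, j - 1
        elif step == "up":
            pairs.append((i - 1, None))
            i -= 1
        else:
            pairs.append((None, j - 1))
            j -= 1
    pairs.reverse()
    return score[n][m], pairs


if __name__ == "__main__":
    # Toy profiles, not real contact vectors.
    total, pairs = align_profiles([1.0, 5.0, 2.0], [5.0, 2.0])
    print(total, pairs)
```

Because each profile position corresponds to one residue, the index pairs returned for the profiles translate directly into a pairwise sequence alignment of the underlying proteins.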
