Journal: Nucleic Acids Research

A supersecondary structure library and search algorithm for modeling loops in protein structures


Abstract

We present a fragment-search-based method for predicting loop conformations in protein models. A hierarchical, multidimensional database has been set up that currently classifies 105 950 loop fragments and loop-flanking secondary structures. Besides loop length and the types of bracing secondary structures, the database is organized along four internal coordinates, a distance and three types of angles, which characterize the geometry of the stem regions. Candidate fragments are selected from this library by matching the length and the types of bracing secondary structures of the query and by satisfying the geometrical restraints of the stems. They are then inserted into the query protein framework, where their fit is assessed by the root mean square deviation (r.m.s.d.) of the stem regions and by the number of rigid-body clashes with the environment. In the final step, the remaining candidate loops are ranked by a Z-score that combines information on sequence similarity and on the fit of predicted and observed phi/psi main-chain dihedral angle propensities. Confidence Z-score cut-offs were determined for each loop length to identify those predicted fragments that outperform a competitive ab initio method. A web server implements the method, regularly updates the fragment library and performs predictions. Predicted segments are returned as they are or, optionally, completed with side-chain reconstruction and subsequently annealed in the environment of the query protein by conjugate gradient minimization. The prediction method was tested on artificially prepared search datasets from which all trivial sequence similarities at the SCOP superfamily level had been removed. Under these conditions, loops of length 4, 8 and 12 can be predicted with coverage of 98, 78 and 28% and with an r.m.s.d. accuracy of at least 0.22, 1.38 and 2.47 Å, respectively. In a head-to-head comparison on loops extracted from freshly deposited new protein folds, the current method outperformed an earlier database search method by a ratio of ~5:1.
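The selection, fit assessment and ranking stages described in the abstract can be pictured with a minimal Python sketch. It is an illustration under stated assumptions, not the authors' implementation: the `Fragment` fields, the `rank_candidates` function, the cut-off values and the single precomputed `raw_score` (standing in for the combined sequence-similarity and phi/psi propensity term) are all hypothetical, and the Z-score here is simply standardized over the surviving candidates.

```python
from dataclasses import dataclass
from statistics import mean, pstdev


@dataclass
class Fragment:
    """Hypothetical library entry; field names are illustrative only."""
    name: str
    length: int          # number of loop residues
    flank_types: tuple   # bracing secondary structures, e.g. ("H", "E")
    stem_rmsd: float     # r.m.s.d. of stem regions after insertion (angstrom)
    clashes: int         # rigid-body clashes with the environment
    raw_score: float     # assumed combined sequence + phi/psi score


def rank_candidates(library, length, flank_types,
                    max_stem_rmsd=1.0, max_clashes=0):
    """Filter a fragment library, then rank survivors by Z-score.

    The cut-offs are placeholder values, not those used in the paper.
    Returns a list of (name, z) tuples, best candidate first.
    """
    # Stage 1: match loop length and bracing secondary structure types.
    pool = [f for f in library
            if f.length == length and f.flank_types == flank_types]

    # Stage 2: keep fragments whose stems fit and that do not clash.
    fitted = [f for f in pool
              if f.stem_rmsd <= max_stem_rmsd and f.clashes <= max_clashes]
    if not fitted:
        return []

    # Stage 3: standardize the raw scores over the surviving candidates.
    scores = [f.raw_score for f in fitted]
    mu = mean(scores)
    sigma = pstdev(scores)
    ranked = [(f.name, (f.raw_score - mu) / sigma if sigma > 0 else 0.0)
              for f in fitted]
    return sorted(ranked, key=lambda item: item[1], reverse=True)
```

In a setting like the one the abstract describes, a per-length confidence cut-off would then be applied to the top Z-score to decide whether to accept the database prediction or fall back to another method; that step is omitted here because the cut-off values are not given in the abstract.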
