Reconstruction of 3D Structures from Protein Contact Maps


Abstract

Proteins are large organic compounds made of amino acids arranged in a linear chain (primary structure). Most proteins fold into unique three-dimensional (3D) structures, referred to interchangeably as tertiary, folded, or native structures. Discovering the tertiary structure of a protein (the Protein Folding Problem) can provide important clues about how the protein performs its function, and it is one of the most important problems in Bioinformatics. A contact map of a given protein P is a binary matrix M such that M_(i,j) = 1 iff the physical distance between amino acids i and j in the native structure is less than or equal to a pre-assigned threshold t. The contact map of each protein is a distinctive signature of its folded structure. Predicting the tertiary structure of a protein directly from its primary structure is a very complex and still unsolved problem. An alternative and probably more feasible approach is to predict the contact map of a protein from its primary structure and then to compute the tertiary structure starting from the predicted contact map. This last problem has recently been proven to be NP-Hard [6]. In this paper we give a heuristic method that is able to reconstruct in a few seconds a 3D model that exactly matches the target contact map. We wish to emphasize that our method computes an exact model for the protein independently of the contact map threshold. To our knowledge, our method outperforms all other techniques in the literature [5,10,17,19] both in the quality of the solutions provided and in running time. Our experimental results are obtained on a non-redundant data set consisting of 1760 proteins, which is by far the largest benchmark set used so far. Average running times range from 3 to 15 seconds depending on the contact map threshold and on the size of the protein.
Repeated applications of our method (starting from randomly chosen distinct initial solutions) show that the same contact map may admit (depending on the threshold) quite different 3D models. Extensive experimental results show that contact map thresholds ranging from 10 to 18 Angstrom allow the reconstruction of 3D models that are very similar to the protein's native structure. Our heuristic is freely available for testing on the web at the following URL: http://vassura.web.cs.unibo.it/cmap23d/.
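
The contact map definition above can be made concrete with a short sketch. The code below is an illustration only, not the paper's heuristic: it builds the binary matrix M from a list of 3D coordinates and counts how many entries of a candidate model's map differ from a target map. The function names `contact_map` and `mismatches`, the use of one point per amino acid, and the toy coordinates are all assumptions made for this example.

```python
import math


def contact_map(coords, threshold):
    """Return the binary contact map M for a list of 3D points.

    coords: one (x, y, z) tuple per amino acid (assumed representation,
            e.g. one representative atom per residue).
    threshold: the cutoff t, in the same unit as the coordinates.
    M[i][j] is 1 iff the distance between points i and j is <= threshold.
    """
    n = len(coords)
    return [
        [1 if math.dist(coords[i], coords[j]) <= threshold else 0
         for j in range(n)]
        for i in range(n)
    ]


def mismatches(model_coords, target_map, threshold):
    """Count entries above the diagonal where the model's map differs
    from the target map. Zero means the model matches the target exactly."""
    model_map = contact_map(model_coords, threshold)
    n = len(target_map)
    return sum(
        model_map[i][j] != target_map[i][j]
        for i in range(n)
        for j in range(i + 1, n)
    )


# Toy data (hypothetical): four points on a line, 4 units apart.
native = [(0.0, 0.0, 0.0), (4.0, 0.0, 0.0), (8.0, 0.0, 0.0), (12.0, 0.0, 0.0)]
target = contact_map(native, threshold=8.0)

# A mirror image preserves all pairwise distances, so it yields the same map.
mirrored = [(-x, y, z) for (x, y, z) in native]
```

In this toy case `mismatches(mirrored, target, 8.0)` is 0, because a reflection leaves every pairwise distance unchanged. This is one simple reason why a single contact map can correspond to more than one 3D model.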
