Biophysical Journal

Selecting High Quality Protein Structures from Diverse Conformational Ensembles


Abstract

Protein structure prediction encompasses two major challenges: (1) generating a large ensemble of high-resolution structures for a given amino-acid sequence, and (2) identifying, in a blind prediction, the structure closest to the native structure. In this article, we address the second challenge by proposing what is, to our knowledge, a novel iterative traveling-salesman-problem-based clustering method that identifies the structures in a given ensemble that are closest to the native structure. At each iteration, the method eliminates clusters of structures that are unlikely to share the native fold, based on a statistical analysis of cluster density and average spherical radius. The method, denoted ICON, has been tested on four data sets: (1) 1400 proteins with high-resolution decoys; (2) medium-to-low-resolution decoys from Decoys ‘R’ Us; (3) medium-to-low-resolution decoys from the first-principles approach ASTRO-FOLD; and (4) selected targets from CASP8. These extensive tests demonstrate that ICON can identify high-quality structures in each ensemble, regardless of the resolution of the conformers. Across a total of 1454 proteins, with an average of 1051 conformers per protein, the conformers selected by ICON are, on average, in the top 3.5% of the conformers in the ensemble.
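The iterative elimination idea in the abstract can be illustrated with a minimal Python sketch. This is not the ICON implementation: the abstract does not give the traveling-salesman-based clustering or the statistical test, so the sketch swaps in a simple greedy "leader" clustering and drops every cluster whose density falls below the mean density of the current round. The function names, the `cutoff` and `min_radius` parameters, the density definition (cluster size divided by the average distance to the medoid), and the toy data are all assumptions made for illustration.

```python
from statistics import mean


def leader_clusters(ids, dist, cutoff):
    """Greedy leader clustering (an assumed stand-in for TSP-based clustering)."""
    clusters = []  # each cluster is a list of ids; the first member is its leader
    for i in ids:
        for cluster in clusters:
            if dist[i][cluster[0]] <= cutoff:
                cluster.append(i)
                break
        else:
            clusters.append([i])
    return clusters


def medoid(cluster, dist):
    """Member with the smallest summed distance to the other members."""
    return min(cluster, key=lambda i: sum(dist[i][j] for j in cluster))


def radius(cluster, dist):
    """Average distance from the medoid to the other members (0 for singletons)."""
    if len(cluster) == 1:
        return 0.0
    m = medoid(cluster, dist)
    return mean(dist[m][j] for j in cluster if j != m)


def density(cluster, dist, min_radius):
    """Members per unit radius; min_radius keeps tiny clusters from dominating."""
    return len(cluster) / max(radius(cluster, dist), min_radius)


def select_conformer(dist, cutoff, min_radius=0.5, max_iter=20):
    """Iteratively drop low-density clusters, then return the medoid of the densest."""
    ids = list(range(len(dist)))
    for _ in range(max_iter):
        clusters = leader_clusters(ids, dist, cutoff)
        if len(clusters) == 1:
            break
        dens = [density(c, dist, min_radius) for c in clusters]
        threshold = mean(dens)  # assumed elimination rule, not the paper's statistic
        survivors = sorted(i for c, d in zip(clusters, dens) if d >= threshold for i in c)
        if len(survivors) == len(ids):
            break  # nothing was eliminated, so further rounds would not change anything
        ids = survivors
    clusters = leader_clusters(ids, dist, cutoff)
    best = max(clusters, key=lambda c: density(c, dist, min_radius))
    return medoid(best, dist)


# Toy usage: made-up 1D "conformers", with absolute difference standing in
# for a structural distance such as pairwise RMSD.
coords = [0.0, 0.1, 0.2, 0.3, 0.15, 5.0, 5.1, 9.0]
dist = [[abs(a - b) for b in coords] for a in coords]
print(select_conformer(dist, cutoff=1.0))
```

In practice the distance matrix would hold pairwise structural distances between conformers; the mean-density threshold is only one plausible way to decide which clusters to discard in each round.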
