
Distance Matrix-Based Approach to Protein Structure Prediction


Abstract

Much structural information is encoded in the internal distances; a distance matrix-based approach can be used to predict protein structure and dynamics, and for structural refinement. Our approach is based on the squared distance matrix $\mathbf{D} = [r_{ij}^2]$, which contains all squared distances between residues in a protein. This distance matrix carries more information than the contact matrix $\mathbf{C}$, whose elements are either 0 or 1 depending on whether the distance $r_{ij}$ is greater or less than a cutoff value $r_{\mathrm{cutoff}}$. We have performed the spectral decomposition of distance matrices, $\mathbf{D} = \sum_k \lambda_k \mathbf{v}_k \mathbf{v}_k^{T}$, in terms of the eigenvalues $\lambda_k$ and the corresponding eigenvectors $\mathbf{v}_k$, and found that it contains at most 5 nonzero terms. The dominant eigenvector is proportional to $r^2$, the squared distance of the points from the center of mass, and the next three are the principal components of the system of points. Knowing $r^2$, we can approximate the distance matrix of a protein with an expected RMSD of about 4.5 Å. We can also explain the role of hydrophobic interactions in protein structure, because $r$ is highly correlated with the hydrophobic profile of the sequence. Moreover, $r$ is highly correlated with several sequence profiles that are useful in protein structure prediction, such as the contact number, the residue-wise contact order (RWCO), and the mean square fluctuations (i.e., crystallographic temperature factors). We have also shown that the next three components are related to the spatial directionality of the secondary structure elements and may likewise be predicted from the sequence, improving overall structure prediction. We have also shown that the large number of available HIV-1 protease structures provides a remarkable sampling of conformations, which can be viewed as direct structural information about the dynamics. After structure matching, we apply principal component analysis (PCA) to obtain the important apparent motions for both bound and unbound structures.
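
The spectral properties of the squared distance matrix described above can be checked numerically. The following is a minimal sketch, not the authors' code: it assumes NumPy, uses a synthetic random point cloud in place of real residue coordinates, and the function names and the relative tolerance `rel_tol` are illustrative choices. It counts the nonzero eigenvalues of $\mathbf{D}$ and measures how strongly the dominant eigenvector correlates with $r^2$.

```python
import numpy as np


def squared_distance_matrix(coords):
    """Return D with D[i, j] = |x_i - x_j|^2 for an (N, 3) coordinate array."""
    diff = coords[:, None, :] - coords[None, :, :]
    return np.sum(diff**2, axis=-1)


def spectral_summary(coords, rel_tol=1e-9):
    """Eigen-decompose D and compare its dominant eigenvector with r^2.

    rel_tol is an assumed threshold: eigenvalues smaller than
    rel_tol * max|eigenvalue| are treated as numerically zero.
    """
    D = squared_distance_matrix(coords)
    eigvals, eigvecs = np.linalg.eigh(D)  # eigenvalues in ascending order
    threshold = rel_tol * np.abs(eigvals).max()
    n_nonzero = int(np.sum(np.abs(eigvals) > threshold))

    # Eigenvector of the largest eigenvalue (the last column from eigh).
    dominant = eigvecs[:, -1]

    # Squared distance of every point from the center of mass.
    r2 = np.sum((coords - coords.mean(axis=0)) ** 2, axis=1)

    # The sign of an eigenvector is arbitrary, so report |correlation|.
    corr = abs(np.corrcoef(dominant, r2)[0, 1])
    return n_nonzero, corr


# Synthetic stand-in for residue coordinates (assumption: 200 random points).
rng = np.random.default_rng(0)
coords = rng.normal(size=(200, 3))

n_nonzero, corr = spectral_summary(coords)
print("nonzero eigenvalues:", n_nonzero)
print("|corr(dominant eigenvector, r^2)|:", round(corr, 3))
```

For points in three dimensions the squared distance matrix has rank at most 5, which is why only that many terms survive in the decomposition. Real residue coordinates (for example, Cα positions read from a structure file) could be passed to `spectral_summary` in place of the synthetic cloud.
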
There are significant similarities between the first few key motions and the first few low-frequency normal modes calculated from a static representative structure with an elastic network model (ENM) based on the contact matrix $\mathbf{C}$ (which is related to $\mathbf{D}$). This strongly suggests that the variations among the observed structures and the corresponding conformational changes are facilitated by the low-frequency, global motions intrinsic to the structure. Similarities are also found when the approach is applied to an NMR ensemble, as well as to atomic molecular dynamics (MD) trajectories. Thus, a sufficiently large number of experimental structures can directly provide important information about protein dynamics, but an ENM can also provide a similar sampling of conformations. Finally, we use distance constraints from databases of known protein structures for structure refinement. We use the distributions of distances of various types in known protein structures to obtain the most probable ranges or the mean-force potentials for the distances. We then impose these constraints on the structures to be refined, or include the mean-force potentials directly in the energy minimization, so that more plausible structural models can be built. We used this approach successfully in 2006 in the CASPR structure refinement.
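
To illustrate how low-frequency modes follow from a contact matrix, here is a hedged sketch of one common ENM variant, the Gaussian network model. The abstract does not say which ENM formulation was used, so this is an assumption, as are the 7 Å cutoff, the function names, and the synthetic random-walk chain that stands in for a representative structure.

```python
import numpy as np


def contact_matrix(coords, cutoff=7.0):
    """C[i, j] = 1 if residues i and j are closer than cutoff, else 0.

    The 7.0 cutoff is an assumed value, not one taken from the abstract.
    """
    diff = coords[:, None, :] - coords[None, :, :]
    d2 = np.sum(diff**2, axis=-1)          # squared distance matrix D
    C = (d2 < cutoff**2).astype(float)     # thresholding D gives C
    np.fill_diagonal(C, 0.0)               # no self-contacts
    return C


def gnm_modes(C):
    """Eigen-decompose the Kirchhoff matrix (degree matrix minus contacts)."""
    kirchhoff = np.diag(C.sum(axis=1)) - C
    return np.linalg.eigh(kirchhoff)       # eigenvalues in ascending order


def gnm_fluctuations(eigvals, eigvecs, n_zero=1):
    """Relative mean square fluctuations from all nonzero modes.

    n_zero=1 assumes a connected network, which has exactly one zero mode.
    """
    vals = eigvals[n_zero:]
    vecs = eigvecs[:, n_zero:]
    return np.sum(vecs**2 / vals, axis=1)


# Synthetic chain (assumption): 100 points joined by steps of length 3.8,
# so consecutive points are always in contact and the network is connected.
rng = np.random.default_rng(1)
steps = rng.normal(size=(100, 3))
steps *= 3.8 / np.linalg.norm(steps, axis=1, keepdims=True)
coords = np.cumsum(steps, axis=0)

C = contact_matrix(coords)
eigvals, eigvecs = gnm_modes(C)
msf = gnm_fluctuations(eigvals, eigvecs)

print("slowest nonzero eigenvalues:", eigvals[1:4])
print("most mobile residue index:", int(np.argmax(msf)))
```

The eigenvectors with the smallest nonzero eigenvalues are the slow, global modes that would be compared with the leading principal components of an experimental ensemble. The per-residue fluctuations are defined only up to an overall scale factor.
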
