PLoS Computational Biology

Neighbor-Dependent Ramachandran Probability Distributions of Amino Acids Developed from a Hierarchical Dirichlet Process Model



Abstract

Distributions of the backbone dihedral angles of proteins have been studied for over 40 years. While many statistical analyses have been presented, only a handful of probability densities are publicly available for use in structure validation and structure prediction methods. The available distributions differ in a number of important ways, which determine their usefulness for various purposes. These include: 1) input data size and criteria for structure inclusion (resolution, R-factor, etc.); 2) filtering of suspect conformations and outliers using B-factors or other features; 3) secondary structure of input data (e.g., whether helix and sheet are included; whether beta turns are included); 4) the method used for determining probability densities ranging from simple histograms to modern nonparametric density estimation; and 5) whether they include nearest neighbor effects on the distribution of conformations in different regions of the Ramachandran map. In this work, Ramachandran probability distributions are presented for residues in protein loops from a high-resolution data set with filtering based on calculated electron densities. Distributions for all 20 amino acids (with cis and trans proline treated separately) have been determined, as well as 420 left-neighbor and 420 right-neighbor dependent distributions. The neighbor-independent and neighbor-dependent probability densities have been accurately estimated using Bayesian nonparametric statistical analysis based on the Dirichlet process. In particular, we used hierarchical Dirichlet process priors, which allow sharing of information between densities for a particular residue type and different neighbor residue types. The resulting distributions are tested in a loop modeling benchmark with the program Rosetta, and are shown to improve protein loop conformation prediction significantly. The distributions are available at .
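The idea of sharing information between a neighbor-independent density and a neighbor-dependent one can be illustrated with a much simpler stand-in than the hierarchical Dirichlet process the abstract describes. The sketch below bins (φ, ψ) pairs on a periodic grid and takes the posterior mean under a finite Dirichlet prior whose base probabilities are the pooled estimate, so sparse neighbor-specific counts are pulled toward the pooled density. The function names, the 10° bin width, the `alpha` values, and the sample angles are all illustrative assumptions, not values from the paper, and this is not the authors' method.

```python
def bin_index(angle_deg, n_bins):
    """Map an angle in degrees to a bin on [-180, 180), wrapping periodically."""
    wrapped = (angle_deg + 180.0) % 360.0  # now in [0, 360)
    # min() guards against a floating-point result landing exactly on n_bins
    return min(int(wrapped / (360.0 / n_bins)), n_bins - 1)


def histogram(phi_psi, n_bins):
    """Count (phi, psi) observations on an n_bins x n_bins periodic grid."""
    counts = [[0] * n_bins for _ in range(n_bins)]
    for phi, psi in phi_psi:
        counts[bin_index(phi, n_bins)][bin_index(psi, n_bins)] += 1
    return counts


def posterior_mean(counts, base, alpha):
    """Posterior mean bin probabilities under a Dirichlet prior.

    `base` holds prior bin probabilities (summing to 1) and `alpha` is the
    prior concentration: larger alpha means stronger shrinkage toward `base`.
    """
    total = sum(sum(row) for row in counts)
    return [
        [(c + alpha * b) / (total + alpha) for c, b in zip(count_row, base_row)]
        for count_row, base_row in zip(counts, base)
    ]


# Hypothetical example data: made-up angles, not taken from any structure.
n_bins = 36  # 10-degree bins (assumed)
uniform = [[1.0 / (n_bins * n_bins)] * n_bins for _ in range(n_bins)]
pooled_obs = [(-60.0, -45.0), (-65.0, -40.0), (-120.0, 130.0), (-75.0, 150.0)]
neighbor_obs = [(-60.0, -45.0)]  # observations with one specific neighbor type

# Neighbor-independent estimate, lightly smoothed toward a uniform density.
pooled = posterior_mean(histogram(pooled_obs, n_bins), uniform, alpha=1.0)
# Neighbor-dependent estimate, shrunk toward the pooled estimate.
neighbor = posterior_mean(histogram(neighbor_obs, n_bins), pooled, alpha=5.0)
```

With only one neighbor-specific observation, `neighbor` stays close to `pooled`; as neighbor-specific counts grow relative to `alpha`, the estimate follows the data more closely. A nonparametric model replaces the fixed grid and fixed `alpha` with components and concentration learned from the data.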
