PLoS Computational Biology

Automatic Prediction of Protein 3D Structures by Probabilistic Multi-template Homology Modeling

Abstract

Homology modeling predicts the 3D structure of a query protein based on the sequence alignment with one or more template proteins of known structure. Its great importance for biological research is owed to its speed, simplicity, reliability and wide applicability, covering more than half of the residues in protein sequence space. Although multiple templates have been shown to generally increase model quality over single templates, the information from multiple templates has so far been combined using empirically motivated, heuristic approaches.

We present here a rigorous statistical framework for multi-template homology modeling. First, we find that the query proteins’ atomic distance restraints can be accurately described by two-component Gaussian mixtures. This insight allowed us to apply the standard laws of probability theory to combine restraints from multiple templates. Second, we derive theoretically optimal weights to correct for the redundancy among related templates. Third, a heuristic template selection strategy is proposed.

We improve the average GDT-ha model quality score by 11% over single template modeling and by 6.5% over a conventional multi-template approach on a set of 1000 query proteins. Robustness with respect to wrong constraints is likewise improved. We have integrated our multi-template modeling approach with the popular MODELLER homology modeling software in our free HHpred server and also offer open source software for running MODELLER with the new restraints at .
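
The idea of combining mixture-shaped restraints from several templates can be illustrated with a small sketch. This is not the authors' implementation: it assumes that each template contributes a two-component Gaussian mixture over a single atom-pair distance, that both components are centred on the distance observed in the template, and that templates are combined by a weighted sum of log-densities. The function names, the mixture parameters (`w_tight`, `sigma_tight`, `sigma_broad`) and the per-template weights are all hypothetical values chosen for illustration.

```python
import math


def gaussian_pdf(d, mean, sigma):
    """Density of a normal distribution N(mean, sigma^2) at distance d."""
    z = (d - mean) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def mixture_pdf(d, mean, w_tight=0.8, sigma_tight=0.5, sigma_broad=5.0):
    """Two-component Gaussian mixture centred on the template distance.

    Assumption: a narrow component models a well-aligned residue pair and
    a broad component absorbs alignment errors. Parameters are made up.
    """
    return (w_tight * gaussian_pdf(d, mean, sigma_tight)
            + (1.0 - w_tight) * gaussian_pdf(d, mean, sigma_broad))


def combined_neg_log_density(d, restraints):
    """Weighted sum of negative log mixture densities over all templates.

    restraints: list of (template_distance, weight) pairs, where the
    weight is a hypothetical stand-in for a redundancy correction.
    """
    return -sum(weight * math.log(mixture_pdf(d, mean))
                for mean, weight in restraints)


def best_distance(restraints, lo=0.0, hi=20.0, step=0.01):
    """Grid search for the distance minimising the combined score."""
    n_points = int(round((hi - lo) / step)) + 1
    grid = [lo + i * step for i in range(n_points)]
    return min(grid, key=lambda d: combined_neg_log_density(d, restraints))


# Two templates roughly agree (8.0 and 8.4); a third is an outlier (15.0).
example_restraints = [(8.0, 1.0), (8.4, 0.5), (15.0, 0.5)]
consensus = best_distance(example_restraints)
```

In this toy setting the broad mixture component keeps the outlying template from dragging the optimum far away from the two templates that agree, which is one way to read the abstract's remark about robustness to wrong constraints.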
