BMC Bioinformatics

Toward an accurate prediction of inter-residue distances in proteins using 2D recursive neural networks


Abstract

Background: Protein inter-residue contact maps provide a translation- and rotation-invariant topological representation of a protein. They can be used as an intermediary step in protein structure prediction. However, contact map prediction is an unbalanced problem, because a protein structure contains far fewer contacts than non-contacts. In this study we explore the possibility of completely eliminating the unbalanced nature of the contact map prediction problem by predicting real-valued distances between residues. Predicting full inter-residue distance maps and applying them in protein structure prediction has been relatively unexplored in the past.

Results: We initially demonstrate that native-like distance maps can reproduce 3D structures almost identical to the targets, giving an average RMSD of 0.5 Å. In addition, physical maps corrupted with an introduced random error of ±6 Å can still reconstruct the targets to within an average RMSD of 2 Å. After demonstrating the reconstruction potential of distance maps, we develop two classes of predictors using two-dimensional recursive neural networks: an ab initio predictor that relies only on the protein sequence and evolutionary information, and a template-based predictor in which additional structural homology information is provided. We find that the ab initio predictor is able to reproduce distances with an RMSD of 6 Å, regardless of the evolutionary content provided. Furthermore, we show that the template-based predictor exploits both sequence and structure information even in cases of dubious homology and outperforms the best template hit by a clear margin of up to 3.7 Å. Lastly, we demonstrate the ability of the two predictors to reconstruct the CASP9 targets shorter than 200 residues, producing results similar to the state-of-the-art machine learning approach implemented in the Distill server.

Conclusions: The methodology presented here, if complemented by more complex reconstruction protocols, can represent a possible path to improve machine learning algorithms for 3D protein structure prediction. Moreover, it can be used as an intermediary step in protein structure prediction, either on its own or complemented by NMR restraints.
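
The relationship between distance maps and contact maps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code: the function names are hypothetical, the chain is a toy random walk rather than real protein data, and the 8 Å contact threshold and the minimum sequence separation of 3 are assumptions chosen for illustration only. The sketch builds a real-valued distance map, checks that it is unchanged by a rigid rotation and translation, and thresholds it into a binary contact map to report how many residue pairs count as contacts.

```python
import math
import random


def toy_chain(n, step=3.8, seed=0):
    """Random walk with a fixed step length, a crude stand-in for a C-alpha trace."""
    rng = random.Random(seed)
    points = [(0.0, 0.0, 0.0)]
    while len(points) < n:
        v = [rng.gauss(0.0, 1.0) for _ in range(3)]
        norm = math.sqrt(sum(c * c for c in v))
        if norm < 1e-9:
            continue  # degenerate direction, draw again
        last = points[-1]
        points.append(tuple(p + step * c / norm for p, c in zip(last, v)))
    return points


def distance_map(coords):
    """Full matrix of pairwise Euclidean distances (real-valued targets)."""
    return [[math.dist(a, b) for b in coords] for a in coords]


def contact_map(dmap, threshold=8.0):
    """Binarise a distance map; the 8.0 threshold is an assumed value."""
    return [[1 if d < threshold else 0 for d in row] for row in dmap]


def contact_fraction(cmap, min_separation=3):
    """Fraction of pairs with |i - j| >= min_separation that are in contact."""
    n = len(cmap)
    pairs = [(i, j) for i in range(n) for j in range(i + min_separation, n)]
    return sum(cmap[i][j] for i, j in pairs) / len(pairs)


def rigid_transform(coords, angle, shift):
    """Rotate about the z axis by `angle` radians, then translate by `shift`."""
    c, s = math.cos(angle), math.sin(angle)
    return [(c * x - s * y + shift[0], s * x + c * y + shift[1], z + shift[2])
            for x, y, z in coords]


chain = toy_chain(60)
dmap = distance_map(chain)
moved = rigid_transform(chain, angle=1.1, shift=(5.0, -2.0, 7.5))
max_diff = max(abs(a - b)
               for row_a, row_b in zip(dmap, distance_map(moved))
               for a, b in zip(row_a, row_b))
print(f"largest change in any distance after the rigid move: {max_diff:.2e}")
print(f"fraction of residue pairs in contact: "
      f"{contact_fraction(contact_map(dmap)):.3f}")
```

Predicting the entries of `dmap` directly is a regression task over every residue pair, whereas predicting the entries of the thresholded map is a classification task in which the two classes can be very unevenly populated.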
