Published in: Annual International Conference on Research in Computational Molecular Biology

Minimizing and Learning Energy Functions for Side-Chain Prediction

Abstract

Side-chain prediction is an important subproblem of the general protein folding problem. Despite much progress in side-chain prediction, performance is far from satisfactory. As an example, the ROSETTA program, which uses simulated annealing to select the minimum-energy conformations, correctly predicts the first two side-chain angles for approximately 72% of the buried residues in a standard data set. Is further improvement more likely to come from better search methods or from better energy functions? Given that exact minimization of the energy is NP-hard, it is difficult to get a systematic answer to this question. In this paper, we present a novel search method and a novel method for learning energy functions from training data, both based on Tree Reweighted Belief Propagation (TRBP). We find that TRBP can find the global optimum of the ROSETTA energy function in a few minutes of computation for approximately 85% of the proteins in a standard benchmark set. TRBP can also effectively bound the partition function, which enables using the Conditional Random Fields (CRF) framework for learning. Interestingly, finding the global minimum does not significantly improve side-chain prediction for an energy function based on ROSETTA's default energy terms (less than 0.1%), while learning new weights gives a significant boost from 72% to 78%. Using a recently modified ROSETTA energy function with a softer Lennard-Jones repulsive term, the global optimum does improve prediction accuracy, from 77% to 78%. Here again, learning new weights improves side-chain modeling even further, to 80%. Finally, the highest accuracy (82.6%) is obtained using an extended rotamer library and CRF-learned weights. Our results suggest that combining machine learning with approximate inference can improve the state of the art in side-chain prediction.
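The search question raised in the abstract can be illustrated on a toy problem. The sketch below is not the paper's TRBP method and does not use ROSETTA's energy terms. It assumes a generic pairwise energy (one singleton term per residue plus one term per residue pair), fills it with random numbers, and compares an exhaustive global minimum against a simple one-residue-at-a-time descent. All function names and the toy energy tables are hypothetical.

```python
import itertools
import random


def total_energy(assignment, unary, pairwise):
    """Energy of one rotamer assignment: singleton terms plus pairwise terms."""
    energy = sum(unary[i][r] for i, r in enumerate(assignment))
    for (i, j), table in pairwise.items():
        energy += table[assignment[i]][assignment[j]]
    return energy


def brute_force_minimum(unary, pairwise):
    """Exhaustive search over all assignments (toy sizes only: cost is exponential)."""
    choices = [range(len(u)) for u in unary]
    return min(itertools.product(*choices),
               key=lambda a: total_energy(a, unary, pairwise))


def greedy_descent(unary, pairwise, start):
    """Local search: change one residue's rotamer at a time while energy drops."""
    current = list(start)
    improved = True
    while improved:
        improved = False
        for i in range(len(unary)):
            before = total_energy(current, unary, pairwise)
            for r in range(len(unary[i])):
                candidate = current[:i] + [r] + current[i + 1:]
                if total_energy(candidate, unary, pairwise) < before - 1e-12:
                    current = candidate
                    before = total_energy(current, unary, pairwise)
                    improved = True
    return tuple(current)


def make_toy_problem(n_residues=5, n_rotamers=3, seed=0):
    """Random energy tables standing in for a real force field (assumption)."""
    rng = random.Random(seed)
    unary = [[rng.uniform(-1, 1) for _ in range(n_rotamers)]
             for _ in range(n_residues)]
    pairwise = {
        (i, j): [[rng.uniform(-1, 1) for _ in range(n_rotamers)]
                 for _ in range(n_rotamers)]
        for i in range(n_residues) for j in range(i + 1, n_residues)
    }
    return unary, pairwise


if __name__ == "__main__":
    unary, pairwise = make_toy_problem()
    best = brute_force_minimum(unary, pairwise)
    local = greedy_descent(unary, pairwise, start=[0] * len(unary))
    print("global minimum:", best, total_energy(best, unary, pairwise))
    print("local search:  ", local, total_energy(local, unary, pairwise))
```

The local search can stop at an assignment whose energy is higher than the global minimum, which is the gap an exact or bounded method such as TRBP is meant to close. Whether closing that gap improves prediction accuracy depends on the energy function itself, which is the point the abstract makes.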
